2025-10-10T00:32:19.5906480Z Current runner version: '2.328.0'
2025-10-10T00:32:19.5912179Z Runner name: 'gpud501'
2025-10-10T00:32:19.5912862Z Runner group name: 'linux.rocm.gpu.group'
2025-10-10T00:32:19.5913655Z Machine name: 'gpud501'
2025-10-10T00:32:19.5916791Z ##[group]GITHUB_TOKEN Permissions
2025-10-10T00:32:19.5919191Z Contents: read
2025-10-10T00:32:19.5919821Z Metadata: read
2025-10-10T00:32:19.5920318Z ##[endgroup]
2025-10-10T00:32:19.5922973Z Secret source: Actions
2025-10-10T00:32:19.5923903Z Prepare workflow directory
2025-10-10T00:32:20.3338920Z Prepare all required actions
2025-10-10T00:32:20.3419922Z Getting action download info
2025-10-10T00:32:20.6750263Z Download action repository 'pytorch/pytorch@main' (SHA:a6fa4f9c283971c0fb6f60a89674a1f35370ac79)
2025-10-10T00:32:25.1163147Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722)
2025-10-10T00:32:25.6051507Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076)
2025-10-10T00:32:25.9952101Z Download action repository 'pytorch/test-infra@main' (SHA:264eed5d70b428e3aa5c1a7c98e4330f866e183f)
2025-10-10T00:32:27.0393041Z Download action repository 'actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02)
2025-10-10T00:32:27.7376769Z Getting action download info
2025-10-10T00:32:27.8903403Z Download action repository 'actions/checkout@v4' (SHA:08eba0b27e820071cde6df949e0beb9ba4906955)
2025-10-10T00:32:28.3821509Z Getting action download info
2025-10-10T00:32:28.5185129Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2025-10-10T00:32:28.9590277Z Getting action download info
2025-10-10T00:32:29.1273249Z Uses: pytorch/pytorch/.github/workflows/_rocm-test.yml@refs/heads/main (344e6365a0068c2d2847fcec0c55dd53291d475e)
2025-10-10T00:32:29.1281103Z ##[group] Inputs
2025-10-10T00:32:29.1281711Z build-environment: linux-jammy-rocm-py3.10
2025-10-10T00:32:29.1284247Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.2"}]}
2025-10-10T00:32:29.1287406Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac
2025-10-10T00:32:29.1288574Z sync-tag:
2025-10-10T00:32:29.1289958Z timeout-minutes: 300
2025-10-10T00:32:29.1290386Z tests-to-include:
2025-10-10T00:32:29.1290760Z dashboard-tag:
2025-10-10T00:32:29.1291724Z disable-monitor: true
2025-10-10T00:32:29.1292205Z monitor-log-interval: 5
2025-10-10T00:32:29.1292690Z monitor-data-collect-interval: 1
2025-10-10T00:32:29.1293197Z ##[endgroup]
2025-10-10T00:32:29.1293823Z Complete job name: linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2)
2025-10-10T00:32:29.3060728Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main
2025-10-10T00:32:29.3061346Z with:
2025-10-10T00:32:29.3061530Z no-sudo: true
2025-10-10T00:32:29.3061737Z submodules: recursive
2025-10-10T00:32:29.3061976Z fetch-depth: 0
2025-10-10T00:32:29.3062306Z env:
2025-10-10T00:32:29.3062480Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:32:29.3062669Z ##[endgroup]
2025-10-10T00:32:29.3136088Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-10-10T00:32:29.3136839Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-10-10T00:32:29.3169854Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-10-10T00:32:29.3170164Z env:
2025-10-10T00:32:29.3170328Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:32:29.3170537Z ##[endgroup]
2025-10-10T00:32:29.3512177Z ##[group]Run # Use all available CPUs for fetching
2025-10-10T00:32:29.3513005Z # Use all available CPUs for fetching
2025-10-10T00:32:29.3513581Z cd "${GITHUB_WORKSPACE}"
2025-10-10T00:32:29.3514339Z git config --global fetch.parallel 0
2025-10-10T00:32:29.3515027Z git config --global submodule.fetchJobs 0
2025-10-10T00:32:29.3515628Z
2025-10-10T00:32:29.3516246Z # Clean workspace. The default checkout action should also do this, but
2025-10-10T00:32:29.3517027Z # do it here as well just in case
2025-10-10T00:32:29.3517574Z if [[ -d .git ]]; then
2025-10-10T00:32:29.3518067Z  if [ -z "${NO_SUDO}" ]; then
2025-10-10T00:32:29.3518567Z  sudo git clean -ffdx
2025-10-10T00:32:29.3519060Z  else
2025-10-10T00:32:29.3519453Z  git clean -ffdx
2025-10-10T00:32:29.3519910Z  fi
2025-10-10T00:32:29.3520306Z fi
2025-10-10T00:32:29.3571849Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-10-10T00:32:29.3572438Z env:
2025-10-10T00:32:29.3572775Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:32:29.3573155Z NO_SUDO: true
2025-10-10T00:32:29.3573490Z ##[endgroup]
2025-10-10T00:32:29.8024047Z Removing .additional_ci_files/
2025-10-10T00:32:29.8024828Z Removing build/
2025-10-10T00:32:29.8025262Z Removing dist/
2025-10-10T00:32:29.8025742Z Removing test/test-reports/
2025-10-10T00:32:29.8175857Z ##[group]Run actions/checkout@v4
2025-10-10T00:32:29.8176494Z with:
2025-10-10T00:32:29.8177068Z ref: 344e6365a0068c2d2847fcec0c55dd53291d475e
2025-10-10T00:32:29.8177783Z fetch-depth: 0
2025-10-10T00:32:29.8178279Z submodules: recursive
2025-10-10T00:32:29.8178847Z show-progress: false
2025-10-10T00:32:29.8179395Z repository: pytorch/pytorch
2025-10-10T00:32:29.8180135Z token: ***
2025-10-10T00:32:29.8180551Z ssh-strict: true
2025-10-10T00:32:29.8180943Z ssh-user: git
2025-10-10T00:32:29.8181361Z persist-credentials: true
2025-10-10T00:32:29.8181812Z clean: true
2025-10-10T00:32:29.8182264Z sparse-checkout-cone-mode: true
2025-10-10T00:32:29.8182797Z fetch-tags: false
2025-10-10T00:32:29.8183178Z lfs: false
2025-10-10T00:32:29.8183570Z set-safe-directory: true
2025-10-10T00:32:29.8184000Z env:
2025-10-10T00:32:29.8184369Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:32:29.8184811Z ##[endgroup]
2025-10-10T00:32:29.9538504Z Syncing repository: pytorch/pytorch
2025-10-10T00:32:29.9541471Z ##[group]Getting Git version info
2025-10-10T00:32:29.9542346Z Working directory is '/var/home/pytorchci/actions-runner/_work/pytorch/pytorch'
2025-10-10T00:32:29.9543488Z [command]/usr/bin/git version
2025-10-10T00:32:29.9543918Z git version 2.34.1
2025-10-10T00:32:29.9545386Z ##[endgroup]
2025-10-10T00:32:29.9551517Z Copying '/var/home/pytorchci/.gitconfig' to '/var/home/pytorchci/actions-runner/_work/_temp/b0a5e1c8-29c8-4072-af5e-ccbdda7d589d/.gitconfig'
2025-10-10T00:32:29.9560349Z Temporarily overriding HOME='/var/home/pytorchci/actions-runner/_work/_temp/b0a5e1c8-29c8-4072-af5e-ccbdda7d589d' before making global git config changes
2025-10-10T00:32:29.9561896Z Adding repository directory to the temporary git global config as a safe directory
2025-10-10T00:32:29.9563148Z [command]/usr/bin/git config --global --add safe.directory /var/home/pytorchci/actions-runner/_work/pytorch/pytorch
2025-10-10T00:32:29.9638768Z [command]/usr/bin/git config --local --get remote.origin.url
2025-10-10T00:32:29.9681983Z https://github.com/pytorch/pytorch
2025-10-10T00:32:29.9703733Z ##[group]Removing previously created refs, to avoid conflicts
2025-10-10T00:32:29.9705387Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD
2025-10-10T00:32:29.9760814Z HEAD
2025-10-10T00:32:29.9846674Z
##[endgroup] 2025-10-10T00:32:29.9850671Z [command]/usr/bin/git submodule status 2025-10-10T00:32:30.0650209Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-10-10T00:32:30.0881372Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081) 2025-10-10T00:32:30.1104205Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-10-10T00:32:30.1355781Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-10-10T00:32:30.1491243Z 2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07 third_party/NVTX (v3.1.0-263-g2942f16) 2025-10-10T00:32:30.1693134Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-10-10T00:32:30.2605342Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-10-10T00:32:30.2703726Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-10-10T00:32:30.2776901Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-10-10T00:32:30.2964612Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d) 2025-10-10T00:32:30.3289199Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0) 2025-10-10T00:32:30.3603853Z 5e3d2445e6a84d9599bee2bf78edbb4d80865e1d third_party/cpuinfo (5e3d244) 2025-10-10T00:32:30.3704845Z f937055efc6d414d11f4c6577e3977fe74f35fb6 third_party/cudnn_frontend (v0.5-52-gf937055) 2025-10-10T00:32:30.3972188Z f3fde58372d33e9a5650ba7b80fc48b3b49d40c8 third_party/cutlass (v4.2.1) 2025-10-10T00:32:30.4136521Z 3cefe0564a8c3de514a152d40a2b4770f2ee5be0 third_party/fbgemm (v1.3.0-rc1-404-g3cefe056) 2025-10-10T00:32:30.4354821Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-10-10T00:32:30.4432361Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 2025-10-10T00:32:30.5184353Z 
e424e3f2e607da02742f73db84873b8084fc714c third_party/fmt (12.0.0) 2025-10-10T00:32:30.5480764Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-10-10T00:32:30.5795231Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0) 2025-10-10T00:32:30.6278401Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-10-10T00:32:30.6478052Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-10-10T00:32:30.6639564Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-10-10T00:32:30.7281634Z 001ba8eb519438592f79dbc8e86a349f5f6c6829 third_party/kineto (heads/main-6-g001ba8e) 2025-10-10T00:32:30.7351633Z cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7 third_party/kleidiai (v1.8.0) 2025-10-10T00:32:30.7420166Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-10-10T00:32:30.7489893Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-10-10T00:32:30.8169361Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-10-10T00:32:30.8242409Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-10-10T00:32:30.8318779Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-10-10T00:32:30.8985123Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-10-10T00:32:30.9209105Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-10-10T00:32:30.9363667Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-10-10T00:32:30.9435441Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1) 2025-10-10T00:32:30.9664768Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 
2025-10-10T00:32:30.9866232Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-10-10T00:32:31.0085901Z af0118d13e52f5a08841464a768e01a0bf3e3075 third_party/tensorpipe (heads/main) 2025-10-10T00:32:31.0122715Z ##[group]Cleaning the repository 2025-10-10T00:32:31.0129296Z [command]/usr/bin/git clean -ffdx 2025-10-10T00:32:31.0758768Z [command]/usr/bin/git reset --hard HEAD 2025-10-10T00:32:31.2300298Z HEAD is now at 4d7f9f3aed6 Revert "[ATen] Fix CUDA reduction warp shuffle order (#164790)" 2025-10-10T00:32:31.2348192Z ##[endgroup] 2025-10-10T00:32:31.2351028Z ##[group]Disabling automatic garbage collection 2025-10-10T00:32:31.2360942Z [command]/usr/bin/git config --local gc.auto 0 2025-10-10T00:32:31.2436794Z ##[endgroup] 2025-10-10T00:32:31.2437714Z ##[group]Setting up auth 2025-10-10T00:32:31.2448173Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-10-10T00:32:31.2526216Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-10-10T00:32:31.3200549Z Entering 'android/libs/fbjni' 2025-10-10T00:32:31.3329048Z Entering 'third_party/FP16' 2025-10-10T00:32:31.3452297Z Entering 'third_party/FXdiv' 2025-10-10T00:32:31.3575289Z Entering 'third_party/NNPACK' 2025-10-10T00:32:31.3698772Z Entering 'third_party/NVTX' 2025-10-10T00:32:31.3823051Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:32:31.3945070Z Entering 'third_party/XNNPACK' 2025-10-10T00:32:31.4096676Z Entering 'third_party/aiter' 2025-10-10T00:32:31.4214827Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:31.4354705Z Entering 'third_party/benchmark' 2025-10-10T00:32:31.4478566Z Entering 'third_party/composable_kernel' 2025-10-10T00:32:31.4622579Z Entering 'third_party/cpp-httplib' 2025-10-10T00:32:31.4743284Z Entering 'third_party/cpuinfo' 2025-10-10T00:32:31.4864753Z Entering 
'third_party/cudnn_frontend' 2025-10-10T00:32:31.4985845Z Entering 'third_party/cutlass' 2025-10-10T00:32:31.5126019Z Entering 'third_party/fbgemm' 2025-10-10T00:32:31.5247471Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:31.5359106Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:31.5491787Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:32:31.5603526Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:31.5733118Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:31.5845112Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:31.5954335Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:32:31.6078532Z Entering 'third_party/flash-attention' 2025-10-10T00:32:31.6196925Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:31.6324990Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:31.6462357Z Entering 'third_party/flatbuffers' 2025-10-10T00:32:31.6591968Z Entering 'third_party/fmt' 2025-10-10T00:32:31.6719241Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:31.6837278Z Entering 'third_party/gloo' 2025-10-10T00:32:31.6958495Z Entering 'third_party/googletest' 2025-10-10T00:32:31.7081450Z Entering 'third_party/ideep' 2025-10-10T00:32:31.7200142Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:32:31.7332954Z Entering 'third_party/ittapi' 2025-10-10T00:32:31.7448570Z Entering 'third_party/kineto' 2025-10-10T00:32:31.7568927Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:31.7673494Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:32:31.7790756Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:31.7904334Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:31.8017194Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 
2025-10-10T00:32:31.8126593Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:31.8245203Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:31.8355458Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:31.8467795Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:31.8582122Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:31.8698337Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:31.8809509Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:31.8927273Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:31.9052341Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:31.9164383Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:31.9286153Z Entering 'third_party/kleidiai' 2025-10-10T00:32:31.9411561Z Entering 'third_party/mimalloc' 2025-10-10T00:32:31.9536979Z Entering 'third_party/nlohmann' 2025-10-10T00:32:31.9662887Z Entering 'third_party/onnx' 2025-10-10T00:32:31.9822296Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:31.9953753Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:32:32.0082886Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:32.0202893Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:32.0318660Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:32.0429337Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:32.0544589Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:32.0657212Z 
Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:32.0771165Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:32.0879934Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:32.0997890Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:32.1119478Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:32.1280880Z Entering 'third_party/pocketfft' 2025-10-10T00:32:32.1403925Z Entering 'third_party/protobuf' 2025-10-10T00:32:32.1531146Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:32.1646592Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:32:32.1771723Z Entering 'third_party/psimd' 2025-10-10T00:32:32.1894610Z Entering 'third_party/pthreadpool' 2025-10-10T00:32:32.2018089Z Entering 'third_party/pybind11' 2025-10-10T00:32:32.2143353Z Entering 'third_party/python-peachpy' 2025-10-10T00:32:32.2265911Z Entering 'third_party/sleef' 2025-10-10T00:32:32.2389477Z Entering 'third_party/tensorpipe' 2025-10-10T00:32:32.2506258Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:32.2620473Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:32.2733348Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:32.2845302Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:32.2952893Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:32.3118641Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-10-10T00:32:32.3190305Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-10-10T00:32:32.3880509Z Entering 
'android/libs/fbjni' 2025-10-10T00:32:32.4006698Z Entering 'third_party/FP16' 2025-10-10T00:32:32.4131741Z Entering 'third_party/FXdiv' 2025-10-10T00:32:32.4257617Z Entering 'third_party/NNPACK' 2025-10-10T00:32:32.4381303Z Entering 'third_party/NVTX' 2025-10-10T00:32:32.4507094Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:32:32.4629679Z Entering 'third_party/XNNPACK' 2025-10-10T00:32:32.4784453Z Entering 'third_party/aiter' 2025-10-10T00:32:32.4906688Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:32.5044897Z Entering 'third_party/benchmark' 2025-10-10T00:32:32.5168799Z Entering 'third_party/composable_kernel' 2025-10-10T00:32:32.5311742Z Entering 'third_party/cpp-httplib' 2025-10-10T00:32:32.5435995Z Entering 'third_party/cpuinfo' 2025-10-10T00:32:32.5559353Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:32:32.5685692Z Entering 'third_party/cutlass' 2025-10-10T00:32:32.5829142Z Entering 'third_party/fbgemm' 2025-10-10T00:32:32.5954265Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:32.6071353Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:32.6203757Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:32:32.6317387Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:32.6449611Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:32.6561495Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:32.6673643Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:32:32.6794049Z Entering 'third_party/flash-attention' 2025-10-10T00:32:32.6911979Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:32.7036289Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:32.7173983Z Entering 'third_party/flatbuffers' 2025-10-10T00:32:32.7298608Z Entering 'third_party/fmt' 2025-10-10T00:32:32.7419710Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:32.7539979Z Entering 
'third_party/gloo' 2025-10-10T00:32:32.7658590Z Entering 'third_party/googletest' 2025-10-10T00:32:32.7776624Z Entering 'third_party/ideep' 2025-10-10T00:32:32.7886939Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:32:32.8017738Z Entering 'third_party/ittapi' 2025-10-10T00:32:32.8135218Z Entering 'third_party/kineto' 2025-10-10T00:32:32.8249900Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:32.8356972Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:32:32.8472907Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:32.8587867Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:32.8698002Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:32:32.8807569Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:32.8928624Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:32.9042302Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:32.9154750Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:32.9268148Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:32.9382341Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:32.9491454Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:32.9613528Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:32.9741087Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:32.9853180Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:32.9971809Z Entering 
'third_party/kleidiai' 2025-10-10T00:32:33.0092432Z Entering 'third_party/mimalloc' 2025-10-10T00:32:33.0213017Z Entering 'third_party/nlohmann' 2025-10-10T00:32:33.0335824Z Entering 'third_party/onnx' 2025-10-10T00:32:33.0489197Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:33.0612775Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:32:33.0731216Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:33.0843100Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:33.0953608Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:33.1064107Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:33.1178104Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:33.1286595Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:33.1395229Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:33.1503182Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:33.1621859Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:33.1740188Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:33.1899917Z Entering 'third_party/pocketfft' 2025-10-10T00:32:33.2017515Z Entering 'third_party/protobuf' 2025-10-10T00:32:33.2141202Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:33.2257744Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:32:33.2380811Z Entering 'third_party/psimd' 2025-10-10T00:32:33.2502278Z Entering 'third_party/pthreadpool' 2025-10-10T00:32:33.2622121Z Entering 'third_party/pybind11' 2025-10-10T00:32:33.2745508Z Entering 'third_party/python-peachpy' 2025-10-10T00:32:33.2865224Z Entering 'third_party/sleef' 2025-10-10T00:32:33.2988389Z Entering 'third_party/tensorpipe' 
2025-10-10T00:32:33.3106274Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:33.3221796Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:33.3337281Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:33.3451753Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:33.3560121Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:33.3733346Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-10-10T00:32:33.3829566Z ##[endgroup] 2025-10-10T00:32:33.3830884Z ##[group]Fetching the repository 2025-10-10T00:32:33.3844449Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-10-10T00:32:34.0001539Z From https://github.com/pytorch/pytorch 2025-10-10T00:32:34.0002520Z - [deleted] (none) -> ciflow/h100-symm-mem/165086 2025-10-10T00:32:34.0814210Z - [deleted] (none) -> ciflow/inductor/164904 2025-10-10T00:32:34.0815528Z - [deleted] (none) -> ciflow/inductor/164905 2025-10-10T00:32:34.0816403Z - [deleted] (none) -> ciflow/inductor/164939 2025-10-10T00:32:34.0818100Z - [deleted] (none) -> ciflow/inductor/164955 2025-10-10T00:32:34.0819232Z - [deleted] (none) -> ciflow/inductor/164982 2025-10-10T00:32:34.0820613Z - [deleted] (none) -> ciflow/inductor/165069 2025-10-10T00:32:34.0822254Z - [deleted] (none) -> ciflow/trunk/164401 2025-10-10T00:32:34.0823624Z - [deleted] (none) -> ciflow/trunk/164904 2025-10-10T00:32:34.0826354Z - [deleted] (none) -> ciflow/trunk/164905 2025-10-10T00:32:34.0828630Z - [deleted] (none) -> ciflow/trunk/164939 2025-10-10T00:32:34.0829691Z - [deleted] (none) -> ciflow/trunk/164955 2025-10-10T00:32:34.0830622Z - [deleted] (none) -> ciflow/trunk/165069 2025-10-10T00:32:34.0831465Z - [deleted] (none) -> ciflow/trunk/165086 2025-10-10T00:32:34.0833033Z - [deleted] (none) -> ciflow/trunk/165095 
2025-10-10T00:32:35.7478702Z + 366d198a244...9d89081c344 bf16_support_per_channel -> origin/bf16_support_per_channel (forced update) 2025-10-10T00:32:35.7503017Z ae8d1cea077..682123a9320 csl/larger_runner -> origin/csl/larger_runner 2025-10-10T00:32:35.7554641Z e21b0377568..e20c22f5398 gh/ColinPeppler/94/base -> origin/gh/ColinPeppler/94/base 2025-10-10T00:32:35.7562221Z 1d1f1485399..3aea73061f6 gh/ColinPeppler/94/head -> origin/gh/ColinPeppler/94/head 2025-10-10T00:32:35.7566550Z + c14872567f2...efff5f35b80 gh/ColinPeppler/94/orig -> origin/gh/ColinPeppler/94/orig (forced update) 2025-10-10T00:32:35.7573682Z af8b079f0e1..0235b4d15d4 gh/ColinPeppler/95/base -> origin/gh/ColinPeppler/95/base 2025-10-10T00:32:35.7576255Z fb4e7c9951d..1e16bde6d22 gh/ColinPeppler/95/head -> origin/gh/ColinPeppler/95/head 2025-10-10T00:32:35.7584699Z + 8bbbd6ebf5f...af78c813294 gh/ColinPeppler/95/orig -> origin/gh/ColinPeppler/95/orig (forced update) 2025-10-10T00:32:35.7592175Z + 984f85b1d10...58298a3cde7 gh/H-Huang/221/orig -> origin/gh/H-Huang/221/orig (forced update) 2025-10-10T00:32:35.7597976Z * [new branch] gh/H-Huang/223/base -> origin/gh/H-Huang/223/base 2025-10-10T00:32:35.7600013Z * [new branch] gh/H-Huang/223/head -> origin/gh/H-Huang/223/head 2025-10-10T00:32:35.7602868Z * [new branch] gh/H-Huang/223/orig -> origin/gh/H-Huang/223/orig 2025-10-10T00:32:35.7617437Z 3dc53fa01ba..75598c7cf5f gh/PaulZhang12/30/base -> origin/gh/PaulZhang12/30/base 2025-10-10T00:32:35.7620616Z 2b29677512c..9abe2f304a2 gh/PaulZhang12/30/head -> origin/gh/PaulZhang12/30/head 2025-10-10T00:32:35.7625669Z + cd966f2e7fa...96dff917b4f gh/PaulZhang12/30/orig -> origin/gh/PaulZhang12/30/orig (forced update) 2025-10-10T00:32:35.7692363Z 02f5fc4c6b5..1ef33328641 gh/fduwjj/217/base -> origin/gh/fduwjj/217/base 2025-10-10T00:32:35.7694592Z 4cee188a328..f239d3e0702 gh/fduwjj/217/head -> origin/gh/fduwjj/217/head 2025-10-10T00:32:35.7699011Z + a71307edbd1...db641b4f26b gh/fduwjj/217/orig -> 
origin/gh/fduwjj/217/orig (forced update) 2025-10-10T00:32:35.7744421Z 2425b214cb8..84d853a0e2d gh/laithsakka/312/head -> origin/gh/laithsakka/312/head 2025-10-10T00:32:35.7748048Z + 3f78d71c231...fdc20895306 gh/laithsakka/312/orig -> origin/gh/laithsakka/312/orig (forced update) 2025-10-10T00:32:35.7752088Z * [new branch] gh/laithsakka/313/base -> origin/gh/laithsakka/313/base 2025-10-10T00:32:35.7754939Z * [new branch] gh/laithsakka/313/head -> origin/gh/laithsakka/313/head 2025-10-10T00:32:35.7757903Z * [new branch] gh/laithsakka/313/orig -> origin/gh/laithsakka/313/orig 2025-10-10T00:32:35.7819711Z bf42ce01c25..b0d879170f3 gh/zpcore/17/base -> origin/gh/zpcore/17/base 2025-10-10T00:32:35.7823456Z 4f19c3b80c0..21237b071ec gh/zpcore/17/head -> origin/gh/zpcore/17/head 2025-10-10T00:32:35.7827742Z + 5c271f68b56...6f40c111979 gh/zpcore/17/orig -> origin/gh/zpcore/17/orig (forced update) 2025-10-10T00:32:35.7832144Z 30084063047..7e786982295 gh/zpcore/18/base -> origin/gh/zpcore/18/base 2025-10-10T00:32:35.7837355Z 77ea658a3c1..a32c2563102 gh/zpcore/18/head -> origin/gh/zpcore/18/head 2025-10-10T00:32:35.7841658Z + 508275c37b6...09898f09901 gh/zpcore/18/orig -> origin/gh/zpcore/18/orig (forced update) 2025-10-10T00:32:35.7846008Z 92628301898..856a8d74b63 gh/zpcore/19/base -> origin/gh/zpcore/19/base 2025-10-10T00:32:35.7851570Z 5a0ac487ae8..ae808389c5c gh/zpcore/19/head -> origin/gh/zpcore/19/head 2025-10-10T00:32:35.7855062Z + 9b71be918d9...1e310298198 gh/zpcore/19/orig -> origin/gh/zpcore/19/orig (forced update) 2025-10-10T00:32:35.7857278Z f3bb4de7f4e..4791a3e0838 gh/zpcore/20/base -> origin/gh/zpcore/20/base 2025-10-10T00:32:35.7859458Z 48a3a841387..92e9ac9b7cf gh/zpcore/20/head -> origin/gh/zpcore/20/head 2025-10-10T00:32:35.7863974Z + f0d10d61616...d4414b75e1c gh/zpcore/20/orig -> origin/gh/zpcore/20/orig (forced update) 2025-10-10T00:32:35.7869228Z 48a3a841387..01807d80e68 gh/zpcore/21/base -> origin/gh/zpcore/21/base 2025-10-10T00:32:35.7873280Z 
26e7302ce79..ddd7c4f4bf0 gh/zpcore/21/head -> origin/gh/zpcore/21/head 2025-10-10T00:32:35.7878172Z + 435e511ca99...ae40b921984 gh/zpcore/21/orig -> origin/gh/zpcore/21/orig (forced update) 2025-10-10T00:32:35.7882403Z * [new branch] huba/debug_mode -> origin/huba/debug_mode 2025-10-10T00:32:35.7890592Z 4c076209a64..ec8f076161c lucaskabela/fix_164823 -> origin/lucaskabela/fix_164823 2025-10-10T00:32:35.7894707Z 6d27a8e5093..a6fa4f9c283 main -> origin/main 2025-10-10T00:32:35.7905500Z * [new branch] pianpwk/debug_mode_inductor -> origin/pianpwk/debug_mode_inductor 2025-10-10T00:32:35.7912562Z 2499e30f153..573301c0b33 pianpwk/skip_python_keys_in_guards -> origin/pianpwk/skip_python_keys_in_guards 2025-10-10T00:32:35.7921942Z + 18fa378ec4f...cbb4538d23d ruisi/aot_eager_pass -> origin/ruisi/aot_eager_pass (forced update) 2025-10-10T00:32:35.7927106Z * [new branch] update-audio-commit-hash/18392707270-1874-1 -> origin/update-audio-commit-hash/18392707270-1874-1 2025-10-10T00:32:35.7929289Z + 8dc1bc54c08...db57fcc0608 update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 (forced update) 2025-10-10T00:32:35.7931325Z + 247f698a57a...b59a8e40dfd update-vision-commit-hash/18361653903-1869-1 -> origin/update-vision-commit-hash/18361653903-1869-1 (forced update) 2025-10-10T00:32:35.7934429Z + 406c9628437...0d1ff5e00b7 update-vllm-commit-hash/18236802781-1857-1 -> origin/update-vllm-commit-hash/18236802781-1857-1 (forced update) 2025-10-10T00:32:35.7937631Z + 9d070a4a349...26f67ef0506 windows_mmap -> origin/windows_mmap (forced update) 2025-10-10T00:32:35.7941166Z t [tag update] ciflow/inductor/148492 -> ciflow/inductor/148492 2025-10-10T00:32:35.7943921Z t [tag update] ciflow/inductor/164144 -> ciflow/inductor/164144 2025-10-10T00:32:35.7945566Z * [new tag] ciflow/inductor/164324 -> ciflow/inductor/164324 2025-10-10T00:32:35.7946993Z t [tag update] ciflow/inductor/164526 -> ciflow/inductor/164526 2025-10-10T00:32:35.7948627Z t [tag 
update] ciflow/inductor/164628 -> ciflow/inductor/164628 2025-10-10T00:32:35.7949886Z * [new tag] ciflow/inductor/164691 -> ciflow/inductor/164691 2025-10-10T00:32:35.7951490Z t [tag update] ciflow/inductor/164806 -> ciflow/inductor/164806 2025-10-10T00:32:35.7953034Z t [tag update] ciflow/inductor/164808 -> ciflow/inductor/164808 2025-10-10T00:32:35.7954525Z t [tag update] ciflow/inductor/164820 -> ciflow/inductor/164820 2025-10-10T00:32:35.7955950Z t [tag update] ciflow/inductor/164821 -> ciflow/inductor/164821 2025-10-10T00:32:35.7957374Z t [tag update] ciflow/inductor/164873 -> ciflow/inductor/164873 2025-10-10T00:32:35.7958766Z t [tag update] ciflow/inductor/164902 -> ciflow/inductor/164902 2025-10-10T00:32:35.7960458Z t [tag update] ciflow/inductor/164984 -> ciflow/inductor/164984 2025-10-10T00:32:35.7961850Z t [tag update] ciflow/inductor/164992 -> ciflow/inductor/164992 2025-10-10T00:32:35.7963239Z t [tag update] ciflow/inductor/165017 -> ciflow/inductor/165017 2025-10-10T00:32:35.7965098Z * [new tag] ciflow/inductor/165030 -> ciflow/inductor/165030 2025-10-10T00:32:35.7965937Z t [tag update] ciflow/inductor/165063 -> ciflow/inductor/165063 2025-10-10T00:32:35.7967384Z t [tag update] ciflow/inductor/165091 -> ciflow/inductor/165091 2025-10-10T00:32:35.7968777Z t [tag update] ciflow/inductor/165092 -> ciflow/inductor/165092 2025-10-10T00:32:35.7969771Z * [new tag] ciflow/inductor/165106 -> ciflow/inductor/165106 2025-10-10T00:32:35.7970968Z * [new tag] ciflow/inductor/165107 -> ciflow/inductor/165107 2025-10-10T00:32:35.7972011Z * [new tag] ciflow/inductor/165112 -> ciflow/inductor/165112 2025-10-10T00:32:35.7973031Z * [new tag] ciflow/inductor/165113 -> ciflow/inductor/165113 2025-10-10T00:32:35.7975102Z * [new tag] ciflow/rocm-mi300/165026 -> ciflow/rocm-mi300/165026 2025-10-10T00:32:35.7976426Z t [tag update] ciflow/rocm/148492 -> ciflow/rocm/148492 2025-10-10T00:32:35.7977630Z * [new tag] ciflow/rocm/165026 -> ciflow/rocm/165026 
2025-10-10T00:32:35.7979306Z t [tag update] ciflow/trunk/148492 -> ciflow/trunk/148492 2025-10-10T00:32:35.7980724Z t [tag update] ciflow/trunk/149536 -> ciflow/trunk/149536 2025-10-10T00:32:35.7981819Z * [new tag] ciflow/trunk/154279 -> ciflow/trunk/154279 2025-10-10T00:32:35.7983754Z t [tag update] ciflow/trunk/164144 -> ciflow/trunk/164144 2025-10-10T00:32:35.7985292Z t [tag update] ciflow/trunk/164510 -> ciflow/trunk/164510 2025-10-10T00:32:35.7986758Z t [tag update] ciflow/trunk/164628 -> ciflow/trunk/164628 2025-10-10T00:32:35.7987844Z * [new tag] ciflow/trunk/164653 -> ciflow/trunk/164653 2025-10-10T00:32:35.7988793Z * [new tag] ciflow/trunk/164691 -> ciflow/trunk/164691 2025-10-10T00:32:35.7990083Z t [tag update] ciflow/trunk/164808 -> ciflow/trunk/164808 2025-10-10T00:32:35.7991782Z t [tag update] ciflow/trunk/165017 -> ciflow/trunk/165017 2025-10-10T00:32:35.7992570Z * [new tag] ciflow/trunk/165033 -> ciflow/trunk/165033 2025-10-10T00:32:35.7993548Z * [new tag] ciflow/trunk/165047 -> ciflow/trunk/165047 2025-10-10T00:32:35.7994673Z * [new tag] ciflow/trunk/165060 -> ciflow/trunk/165060 2025-10-10T00:32:35.7996296Z t [tag update] ciflow/trunk/165090 -> ciflow/trunk/165090 2025-10-10T00:32:35.7997777Z t [tag update] ciflow/trunk/165094 -> ciflow/trunk/165094 2025-10-10T00:32:35.7998865Z * [new tag] ciflow/trunk/165113 -> ciflow/trunk/165113 2025-10-10T00:32:35.8000134Z t [tag update] ciflow/vllm/164628 -> ciflow/vllm/164628 2025-10-10T00:32:35.8001204Z * [new tag] ciflow/xpu/162454 -> ciflow/xpu/162454 2025-10-10T00:32:35.8004994Z * [new tag] trunk/344e6365a0068c2d2847fcec0c55dd53291d475e -> trunk/344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:32:35.8006438Z * [new tag] trunk/34ac9b61cbfcf17328ccb8b729509829447fdddd -> trunk/34ac9b61cbfcf17328ccb8b729509829447fdddd 2025-10-10T00:32:35.8007840Z * [new tag] trunk/47956196d99166fe9083beb2a52fd2e6c90b2011 -> trunk/47956196d99166fe9083beb2a52fd2e6c90b2011 2025-10-10T00:32:35.8009269Z * [new tag] 
trunk/4a0df39f814afad087e8b29dd2914a8b54567694 -> trunk/4a0df39f814afad087e8b29dd2914a8b54567694 2025-10-10T00:32:35.8010745Z * [new tag] trunk/600db525bdb5e76c12f30f271d969d43a7f8efef -> trunk/600db525bdb5e76c12f30f271d969d43a7f8efef 2025-10-10T00:32:35.8012175Z * [new tag] trunk/9aa92f246fa5fe5cfda17970d41d167b19a0612a -> trunk/9aa92f246fa5fe5cfda17970d41d167b19a0612a 2025-10-10T00:32:35.8014102Z * [new tag] trunk/a57a14868dcfd9dabf9bd19b6b11f31967c80c87 -> trunk/a57a14868dcfd9dabf9bd19b6b11f31967c80c87 2025-10-10T00:32:35.8015652Z * [new tag] trunk/a6fa4f9c283971c0fb6f60a89674a1f35370ac79 -> trunk/a6fa4f9c283971c0fb6f60a89674a1f35370ac79 2025-10-10T00:32:35.8017035Z * [new tag] trunk/f6de195616432f42a545b98ea41cc816019d1c60 -> trunk/f6de195616432f42a545b98ea41cc816019d1c60 2025-10-10T00:32:35.9189374Z [command]/usr/bin/git rev-parse --verify --quiet 344e6365a0068c2d2847fcec0c55dd53291d475e^{object} 2025-10-10T00:32:35.9300208Z 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:32:35.9314231Z ##[endgroup] 2025-10-10T00:32:35.9314986Z ##[group]Determining the checkout info 2025-10-10T00:32:35.9315821Z ##[endgroup] 2025-10-10T00:32:35.9324267Z [command]/usr/bin/git sparse-checkout disable 2025-10-10T00:32:35.9758382Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-10-10T00:32:35.9827952Z ##[group]Checking out the ref 2025-10-10T00:32:35.9836230Z [command]/usr/bin/git checkout --progress --force 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:32:36.2245956Z Previous HEAD position was 4d7f9f3aed6 Revert "[ATen] Fix CUDA reduction warp shuffle order (#164790)" 2025-10-10T00:32:36.2265669Z HEAD is now at 344e6365a00 [inductor][eazy] change how torch.use_deterministic_algorithms affect inductor (#164905) 2025-10-10T00:32:36.2371741Z ##[endgroup] 2025-10-10T00:32:36.2372543Z ##[group]Setting up auth for fetching submodules 2025-10-10T00:32:36.2383392Z [command]/usr/bin/git config --global http.https://github.com/.extraheader 
AUTHORIZATION: basic *** 2025-10-10T00:32:36.2472244Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-10-10T00:32:36.2550964Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-10-10T00:32:36.2624861Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-10-10T00:32:36.2693644Z ##[endgroup] 2025-10-10T00:32:36.2694435Z ##[group]Fetching submodules 2025-10-10T00:32:36.2700295Z [command]/usr/bin/git submodule sync --recursive 2025-10-10T00:32:36.3410142Z Synchronizing submodule url for 'android/libs/fbjni' 2025-10-10T00:32:36.3516638Z Synchronizing submodule url for 'third_party/FP16' 2025-10-10T00:32:36.3623846Z Synchronizing submodule url for 'third_party/FXdiv' 2025-10-10T00:32:36.3730182Z Synchronizing submodule url for 'third_party/NNPACK' 2025-10-10T00:32:36.3835967Z Synchronizing submodule url for 'third_party/NVTX' 2025-10-10T00:32:36.3943837Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-10-10T00:32:36.4049487Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-10-10T00:32:36.4189852Z Synchronizing submodule url for 'third_party/aiter' 2025-10-10T00:32:36.4288834Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:36.4412688Z Synchronizing submodule url for 'third_party/benchmark' 2025-10-10T00:32:36.4522407Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-10-10T00:32:36.4645775Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-10-10T00:32:36.4748875Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-10-10T00:32:36.4856287Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-10-10T00:32:36.4958902Z Synchronizing submodule url for 'third_party/cutlass' 2025-10-10T00:32:36.5087392Z Synchronizing submodule url for 'third_party/fbgemm' 2025-10-10T00:32:36.5190282Z Synchronizing 
submodule url for 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:36.5286797Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:36.5401309Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:32:36.5499031Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:36.5614339Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:36.5713283Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:36.5807779Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-10-10T00:32:36.5921021Z Synchronizing submodule url for 'third_party/flash-attention' 2025-10-10T00:32:36.6015788Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:36.6123257Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:36.6248210Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-10-10T00:32:36.6361730Z Synchronizing submodule url for 'third_party/fmt' 2025-10-10T00:32:36.6465142Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:36.6572619Z Synchronizing submodule url for 'third_party/gloo' 2025-10-10T00:32:36.6679841Z Synchronizing submodule url for 'third_party/googletest' 2025-10-10T00:32:36.6788556Z Synchronizing submodule url for 'third_party/ideep' 2025-10-10T00:32:36.6889951Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-10-10T00:32:36.7017742Z Synchronizing submodule url for 'third_party/ittapi' 2025-10-10T00:32:36.7124998Z Synchronizing submodule url for 'third_party/kineto' 2025-10-10T00:32:36.7226718Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:36.7317999Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 
2025-10-10T00:32:36.7417322Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:36.7519201Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:36.7615228Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:32:36.7704561Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:36.7812840Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:36.7910911Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:36.8006164Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:36.8108334Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:36.8205371Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:36.8295946Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:36.8400744Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:36.8507301Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:36.8603971Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:36.8713296Z Synchronizing submodule url for 'third_party/kleidiai' 2025-10-10T00:32:36.8827392Z Synchronizing submodule url for 'third_party/mimalloc' 2025-10-10T00:32:36.8935144Z Synchronizing submodule url for 'third_party/nlohmann' 
2025-10-10T00:32:36.9048268Z Synchronizing submodule url for 'third_party/onnx' 2025-10-10T00:32:36.9184681Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:36.9297155Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-10-10T00:32:36.9403295Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:36.9497892Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:36.9593461Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:36.9691481Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:36.9786369Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:36.9881106Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:36.9975685Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:37.0063309Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:37.0162234Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:37.0265384Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:37.0413778Z Synchronizing submodule url for 'third_party/pocketfft' 2025-10-10T00:32:37.0520506Z Synchronizing submodule url for 'third_party/protobuf' 2025-10-10T00:32:37.0625482Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:37.0723059Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-10-10T00:32:37.0824068Z Synchronizing submodule url for 'third_party/psimd' 2025-10-10T00:32:37.0933008Z Synchronizing 
submodule url for 'third_party/pthreadpool' 2025-10-10T00:32:37.1040438Z Synchronizing submodule url for 'third_party/pybind11' 2025-10-10T00:32:37.1147775Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-10-10T00:32:37.1254450Z Synchronizing submodule url for 'third_party/sleef' 2025-10-10T00:32:37.1361563Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-10-10T00:32:37.1458083Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:37.1551907Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:37.1643631Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:37.1739033Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:37.1826910Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:37.1976074Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-10-10T00:32:37.3269935Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-10-10T00:32:37.3892422Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-10-10T00:32:37.4508856Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-10-10T00:32:37.5150342Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-10-10T00:32:37.5830199Z Submodule path 'third_party/NVTX': checked out '2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07' 2025-10-10T00:32:37.6487013Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-10-10T00:32:37.7517415Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-10-10T00:32:37.8424038Z Submodule path 'third_party/aiter': checked out 
'01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-10-10T00:32:37.9396304Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-10-10T00:32:38.0115492Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-10-10T00:32:38.1167165Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-10-10T00:32:38.1883483Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-10-10T00:32:38.2543472Z Submodule path 'third_party/cpuinfo': checked out '5e3d2445e6a84d9599bee2bf78edbb4d80865e1d' 2025-10-10T00:32:38.3264999Z Submodule path 'third_party/cudnn_frontend': checked out 'f937055efc6d414d11f4c6577e3977fe74f35fb6' 2025-10-10T00:32:38.4145669Z Submodule path 'third_party/cutlass': checked out 'f3fde58372d33e9a5650ba7b80fc48b3b49d40c8' 2025-10-10T00:32:38.5070968Z Submodule path 'third_party/fbgemm': checked out '3cefe0564a8c3de514a152d40a2b4770f2ee5be0' 2025-10-10T00:32:38.5649147Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-10-10T00:32:38.6407322Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-10-10T00:32:38.7053260Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-10-10T00:32:38.7820339Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '311f3c8e51dc0eb56310cfc6980bf63d0fbd7917' 2025-10-10T00:32:38.8463126Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-10-10T00:32:38.9041773Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-10-10T00:32:38.9728483Z Submodule path 'third_party/fbgemm/external/json': checked 
out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-10-10T00:32:39.0468572Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-10-10T00:32:39.1406563Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-10-10T00:32:39.2204006Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-10-10T00:32:39.3007991Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-10-10T00:32:39.3694194Z Submodule path 'third_party/fmt': checked out 'e424e3f2e607da02742f73db84873b8084fc714c' 2025-10-10T00:32:39.4345584Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-10-10T00:32:39.5006916Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-10-10T00:32:39.5658638Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-10-10T00:32:39.6372261Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-10-10T00:32:39.7295560Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-10-10T00:32:39.7984666Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-10-10T00:32:39.8688226Z Submodule path 'third_party/kineto': checked out '001ba8eb519438592f79dbc8e86a349f5f6c6829' 2025-10-10T00:32:39.9359353Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-10-10T00:32:39.9995105Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-10-10T00:32:40.0616826Z Submodule path 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-10-10T00:32:40.1239582Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-10-10T00:32:40.1855745Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-10-10T00:32:40.2431236Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-10-10T00:32:40.3045674Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-10-10T00:32:40.3683475Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-10-10T00:32:40.4391210Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-10-10T00:32:40.4976558Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-10-10T00:32:40.5609761Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-10-10T00:32:40.6254438Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-10-10T00:32:40.6881556Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-10-10T00:32:40.7513083Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out 
'40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-10-10T00:32:40.8115763Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-10-10T00:32:40.8818000Z Submodule path 'third_party/kleidiai': checked out 'cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7' 2025-10-10T00:32:40.9514759Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-10-10T00:32:41.0278194Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-10-10T00:32:41.1323759Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-10-10T00:32:41.2064325Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-10-10T00:32:41.2859517Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-10-10T00:32:41.3445307Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-10-10T00:32:41.4044729Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-10-10T00:32:41.4617379Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-10-10T00:32:41.5312514Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-10-10T00:32:41.5908979Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-10-10T00:32:41.6466326Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-10-10T00:32:41.7077054Z Submodule path 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-10-10T00:32:41.7721810Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-10-10T00:32:41.8335028Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-10-10T00:32:41.9316849Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-10-10T00:32:42.0041303Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-10-10T00:32:42.1083764Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-10-10T00:32:42.1675357Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-10-10T00:32:42.2293813Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-10-10T00:32:42.2924866Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-10-10T00:32:42.3540615Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-10-10T00:32:42.4248701Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-10-10T00:32:42.4882124Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-10-10T00:32:42.5520545Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-10-10T00:32:42.6202022Z Submodule path 'third_party/tensorpipe': checked out 'af0118d13e52f5a08841464a768e01a0bf3e3075' 2025-10-10T00:32:42.6777840Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked 
out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-10-10T00:32:42.7342761Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-10-10T00:32:42.8262338Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-10-10T00:32:42.8904351Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-10-10T00:32:42.9449413Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-10-10T00:32:42.9687979Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-10-10T00:32:43.0379046Z Entering 'android/libs/fbjni' 2025-10-10T00:32:43.0492660Z Entering 'third_party/FP16' 2025-10-10T00:32:43.0604277Z Entering 'third_party/FXdiv' 2025-10-10T00:32:43.0715447Z Entering 'third_party/NNPACK' 2025-10-10T00:32:43.0833541Z Entering 'third_party/NVTX' 2025-10-10T00:32:43.0945575Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:32:43.1056917Z Entering 'third_party/XNNPACK' 2025-10-10T00:32:43.1199206Z Entering 'third_party/aiter' 2025-10-10T00:32:43.1308695Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:43.1435931Z Entering 'third_party/benchmark' 2025-10-10T00:32:43.1545522Z Entering 'third_party/composable_kernel' 2025-10-10T00:32:43.1672722Z Entering 'third_party/cpp-httplib' 2025-10-10T00:32:43.1782955Z Entering 'third_party/cpuinfo' 2025-10-10T00:32:43.1894445Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:32:43.2004877Z Entering 'third_party/cutlass' 2025-10-10T00:32:43.2136299Z Entering 'third_party/fbgemm' 2025-10-10T00:32:43.2252383Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:43.2355738Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:43.2477258Z Entering 'third_party/fbgemm/external/cpuinfo' 
2025-10-10T00:32:43.2581576Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:43.2704841Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:43.2808659Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:43.2908788Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:32:43.3021754Z Entering 'third_party/flash-attention' 2025-10-10T00:32:43.3132229Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:43.3251969Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:43.3379555Z Entering 'third_party/flatbuffers' 2025-10-10T00:32:43.3494713Z Entering 'third_party/fmt' 2025-10-10T00:32:43.3605468Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:43.3714665Z Entering 'third_party/gloo' 2025-10-10T00:32:43.3824998Z Entering 'third_party/googletest' 2025-10-10T00:32:43.3935481Z Entering 'third_party/ideep' 2025-10-10T00:32:43.4041217Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:32:43.4161691Z Entering 'third_party/ittapi' 2025-10-10T00:32:43.4271651Z Entering 'third_party/kineto' 2025-10-10T00:32:43.4376911Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:43.4475018Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:32:43.4580989Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:43.4681735Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:43.4784403Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:32:43.4881057Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:43.4988881Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:43.5089545Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:43.5190470Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:43.5292671Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:43.5390330Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:43.5488202Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:43.5592145Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:43.5709336Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:43.5812226Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:43.5921322Z Entering 'third_party/kleidiai' 2025-10-10T00:32:43.6032154Z Entering 'third_party/mimalloc' 2025-10-10T00:32:43.6140691Z Entering 'third_party/nlohmann' 2025-10-10T00:32:43.6255671Z Entering 'third_party/onnx' 2025-10-10T00:32:43.6398766Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:43.6515260Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:32:43.6623910Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:43.6726511Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:43.6825577Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:43.6923746Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:43.7027680Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:43.7127981Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:43.7226911Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:43.7321803Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:43.7426502Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:43.7535101Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:43.7682621Z Entering 'third_party/pocketfft' 2025-10-10T00:32:43.7791041Z Entering 'third_party/protobuf' 2025-10-10T00:32:43.7904289Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:43.8004343Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:32:43.8113653Z Entering 'third_party/psimd' 2025-10-10T00:32:43.8221635Z Entering 'third_party/pthreadpool' 2025-10-10T00:32:43.8328914Z Entering 'third_party/pybind11' 2025-10-10T00:32:43.8439853Z Entering 'third_party/python-peachpy' 2025-10-10T00:32:43.8548048Z Entering 'third_party/sleef' 2025-10-10T00:32:43.8658266Z Entering 'third_party/tensorpipe' 2025-10-10T00:32:43.8764295Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:43.8863565Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:43.8963344Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:43.9063418Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:43.9158069Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:43.9301676Z ##[endgroup] 2025-10-10T00:32:43.9302450Z ##[group]Persisting credentials for submodules 2025-10-10T00:32:43.9316366Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-10-10T00:32:43.9982434Z Entering 'android/libs/fbjni' 2025-10-10T00:32:44.0048853Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0049509Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0145147Z Entering 'third_party/FP16' 2025-10-10T00:32:44.0211513Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0212176Z url.https://github.com/.insteadof 
2025-10-10T00:32:44.0306577Z Entering 'third_party/FXdiv' 2025-10-10T00:32:44.0372454Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0373106Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0466602Z Entering 'third_party/NNPACK' 2025-10-10T00:32:44.0532315Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0532975Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0624625Z Entering 'third_party/NVTX' 2025-10-10T00:32:44.0690563Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0691211Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0791024Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:32:44.0856283Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0856944Z url.https://github.com/.insteadof 2025-10-10T00:32:44.0948309Z Entering 'third_party/XNNPACK' 2025-10-10T00:32:44.1013077Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1013727Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1141316Z Entering 'third_party/aiter' 2025-10-10T00:32:44.1206123Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1213158Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1301976Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:44.1362534Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1363175Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1478879Z Entering 'third_party/benchmark' 2025-10-10T00:32:44.1544763Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1545441Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1638574Z Entering 'third_party/composable_kernel' 2025-10-10T00:32:44.1702300Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1702978Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1817324Z Entering 'third_party/cpp-httplib' 2025-10-10T00:32:44.1881576Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1882587Z url.https://github.com/.insteadof 2025-10-10T00:32:44.1974087Z Entering 'third_party/cpuinfo' 2025-10-10T00:32:44.2039104Z 
url.https://github.com/.insteadof 2025-10-10T00:32:44.2039759Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2133760Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:32:44.2197604Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2198296Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2290190Z Entering 'third_party/cutlass' 2025-10-10T00:32:44.2352643Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2353305Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2466668Z Entering 'third_party/fbgemm' 2025-10-10T00:32:44.2529923Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2530574Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2624701Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:44.2685504Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2686149Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2775058Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:44.2834788Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2835442Z url.https://github.com/.insteadof 2025-10-10T00:32:44.2943896Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:32:44.3002055Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3002708Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3091757Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:44.3149690Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3150336Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3258988Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:44.3317456Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3318107Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3405116Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:44.3464747Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3465396Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3553427Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:32:44.3613293Z 
url.https://github.com/.insteadof 2025-10-10T00:32:44.3613940Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3711888Z Entering 'third_party/flash-attention' 2025-10-10T00:32:44.3776520Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3777179Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3869939Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:44.3929239Z url.https://github.com/.insteadof 2025-10-10T00:32:44.3929881Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4034035Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:44.4092988Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4093631Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4206250Z Entering 'third_party/flatbuffers' 2025-10-10T00:32:44.4270838Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4271499Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4369749Z Entering 'third_party/fmt' 2025-10-10T00:32:44.4433620Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4434409Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4525494Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:44.4589558Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4590236Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4682535Z Entering 'third_party/gloo' 2025-10-10T00:32:44.4746876Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4747539Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4839518Z Entering 'third_party/googletest' 2025-10-10T00:32:44.4903261Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4903917Z url.https://github.com/.insteadof 2025-10-10T00:32:44.4998608Z Entering 'third_party/ideep' 2025-10-10T00:32:44.5061993Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5062671Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5148015Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:32:44.5207420Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5208077Z 
url.https://github.com/.insteadof 2025-10-10T00:32:44.5318682Z Entering 'third_party/ittapi' 2025-10-10T00:32:44.5380878Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5381539Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5473482Z Entering 'third_party/kineto' 2025-10-10T00:32:44.5537572Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5538257Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5627323Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:44.5687480Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5688157Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5776059Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:32:44.5834915Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5835562Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5930574Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:44.5991922Z url.https://github.com/.insteadof 2025-10-10T00:32:44.5992584Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6089162Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:44.6150737Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6151412Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6241515Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:32:44.6302526Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6303192Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6391190Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:44.6454788Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6455452Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6555113Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:44.6617036Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6617693Z 
url.https://github.com/.insteadof 2025-10-10T00:32:44.6713398Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:44.6774615Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6775272Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6867728Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:44.6929612Z url.https://github.com/.insteadof 2025-10-10T00:32:44.6930282Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7025211Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:44.7087441Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7088111Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7179004Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:44.7239703Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7240349Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7327993Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:44.7389994Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7390646Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7485563Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:44.7544663Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7545347Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7649555Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:44.7708808Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7709478Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7799935Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:44.7862025Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7862670Z url.https://github.com/.insteadof 2025-10-10T00:32:44.7961608Z Entering 'third_party/kleidiai' 
2025-10-10T00:32:44.8026533Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8027182Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8121765Z Entering 'third_party/mimalloc' 2025-10-10T00:32:44.8185283Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8185922Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8279469Z Entering 'third_party/nlohmann' 2025-10-10T00:32:44.8343666Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8344341Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8440466Z Entering 'third_party/onnx' 2025-10-10T00:32:44.8504481Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8505123Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8633532Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:44.8696680Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8697344Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8801549Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:32:44.8866425Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8867101Z url.https://github.com/.insteadof 2025-10-10T00:32:44.8960930Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:44.9020945Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9021617Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9112447Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:44.9171484Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9172188Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9262279Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:44.9323250Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9323925Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9414282Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:44.9473343Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9474054Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9568498Z Entering 
'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:44.9628392Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9629048Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9719155Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:44.9780651Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9781309Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9873369Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:44.9935295Z url.https://github.com/.insteadof 2025-10-10T00:32:44.9935952Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0023999Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:45.0084947Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0085622Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0182694Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:45.0243557Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0244229Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0342060Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:45.0403317Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0403977Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0541603Z Entering 'third_party/pocketfft' 2025-10-10T00:32:45.0606502Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0607176Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0700608Z Entering 'third_party/protobuf' 2025-10-10T00:32:45.0764894Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0765557Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0862932Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:45.0922857Z url.https://github.com/.insteadof 2025-10-10T00:32:45.0923535Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1015812Z Entering 'third_party/protobuf/third_party/googletest' 
2025-10-10T00:32:45.1075464Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1076145Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1173893Z Entering 'third_party/psimd' 2025-10-10T00:32:45.1237313Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1237971Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1330850Z Entering 'third_party/pthreadpool' 2025-10-10T00:32:45.1394485Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1395140Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1488620Z Entering 'third_party/pybind11' 2025-10-10T00:32:45.1552094Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1552735Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1645614Z Entering 'third_party/python-peachpy' 2025-10-10T00:32:45.1709348Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1710030Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1802938Z Entering 'third_party/sleef' 2025-10-10T00:32:45.1867604Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1868281Z url.https://github.com/.insteadof 2025-10-10T00:32:45.1960298Z Entering 'third_party/tensorpipe' 2025-10-10T00:32:45.2024535Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2025196Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2114799Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:45.2173683Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2174348Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2260889Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:45.2319815Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2320511Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2408954Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:45.2467045Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2468236Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2557244Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:45.2615273Z 
url.https://github.com/.insteadof 2025-10-10T00:32:45.2615930Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2699457Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:45.2758524Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2759181Z url.https://github.com/.insteadof 2025-10-10T00:32:45.2899865Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-10-10T00:32:45.3600124Z Entering 'android/libs/fbjni' 2025-10-10T00:32:45.3701264Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-10-10T00:32:45.3762907Z Entering 'third_party/FP16' 2025-10-10T00:32:45.3865841Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-10-10T00:32:45.3923898Z Entering 'third_party/FXdiv' 2025-10-10T00:32:45.4027335Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-10-10T00:32:45.4083503Z Entering 'third_party/NNPACK' 2025-10-10T00:32:45.4183867Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-10-10T00:32:45.4241767Z Entering 'third_party/NVTX' 2025-10-10T00:32:45.4342516Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-10-10T00:32:45.4400736Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:32:45.4499048Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-10-10T00:32:45.4557266Z Entering 'third_party/XNNPACK' 2025-10-10T00:32:45.4656353Z 
file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-10-10T00:32:45.4747114Z Entering 'third_party/aiter' 2025-10-10T00:32:45.4845840Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-10-10T00:32:45.4901240Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:45.4998231Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-10-10T00:32:45.5075992Z Entering 'third_party/benchmark' 2025-10-10T00:32:45.5176360Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-10-10T00:32:45.5234010Z Entering 'third_party/composable_kernel' 2025-10-10T00:32:45.5332812Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-10-10T00:32:45.5410425Z Entering 'third_party/cpp-httplib' 2025-10-10T00:32:45.5508832Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-10-10T00:32:45.5564758Z Entering 'third_party/cpuinfo' 2025-10-10T00:32:45.5660339Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-10-10T00:32:45.5717495Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:32:45.5814807Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-10-10T00:32:45.5870673Z Entering 'third_party/cutlass' 2025-10-10T00:32:45.5969918Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-10-10T00:32:45.6048158Z Entering 'third_party/fbgemm' 2025-10-10T00:32:45.6153422Z 
file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-10-10T00:32:45.6213789Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:45.6311306Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-10-10T00:32:45.6367072Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:45.6465943Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-10-10T00:32:45.6540369Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:32:45.6639516Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-10-10T00:32:45.6694358Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:45.6790002Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-10-10T00:32:45.6864381Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:45.6963407Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-10-10T00:32:45.7017693Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:45.7115244Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-10-10T00:32:45.7169146Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:32:45.7263664Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-10-10T00:32:45.7327576Z Entering 'third_party/flash-attention' 2025-10-10T00:32:45.7428238Z 
file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-10-10T00:32:45.7485481Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:45.7583240Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-10-10T00:32:45.7648648Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:45.7743664Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-10-10T00:32:45.7821127Z Entering 'third_party/flatbuffers' 2025-10-10T00:32:45.7919045Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-10-10T00:32:45.7983714Z Entering 'third_party/fmt' 2025-10-10T00:32:45.8080014Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-10-10T00:32:45.8138171Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:45.8239144Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-10-10T00:32:45.8295917Z Entering 'third_party/gloo' 2025-10-10T00:32:45.8393974Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-10-10T00:32:45.8451868Z Entering 'third_party/googletest' 2025-10-10T00:32:45.8552621Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:32:45.8611422Z Entering 'third_party/ideep' 2025-10-10T00:32:45.8711393Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-10-10T00:32:45.8764298Z Entering 
'third_party/ideep/mkl-dnn' 2025-10-10T00:32:45.8863130Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-10-10T00:32:45.8943585Z Entering 'third_party/ittapi' 2025-10-10T00:32:45.9043109Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-10-10T00:32:45.9099295Z Entering 'third_party/kineto' 2025-10-10T00:32:45.9198584Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-10-10T00:32:45.9253362Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:45.9348227Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-10-10T00:32:45.9397938Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:32:45.9491565Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-10-10T00:32:45.9546778Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:45.9644156Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-10-10T00:32:45.9697403Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:45.9797731Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-10-10T00:32:45.9852970Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:32:45.9949925Z 
file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-10-10T00:32:45.9999371Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:46.0097925Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-10-10T00:32:46.0155852Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:46.0256973Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-10-10T00:32:46.0310846Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:46.0406775Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:32:46.0462281Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:46.0561912Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-10-10T00:32:46.0618763Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:46.0715593Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-10-10T00:32:46.0769541Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:46.0867471Z 
file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-10-10T00:32:46.0916087Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:46.1013717Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-10-10T00:32:46.1071579Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:46.1171098Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-10-10T00:32:46.1237604Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:46.1333391Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-10-10T00:32:46.1387642Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:46.1483898Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-10-10T00:32:46.1544092Z Entering 'third_party/kleidiai' 2025-10-10T00:32:46.1644228Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-10-10T00:32:46.1703536Z Entering 'third_party/mimalloc' 2025-10-10T00:32:46.1802787Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-10-10T00:32:46.1860902Z Entering 'third_party/nlohmann' 2025-10-10T00:32:46.1961388Z 
file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-10-10T00:32:46.2023188Z Entering 'third_party/onnx' 2025-10-10T00:32:46.2123118Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-10-10T00:32:46.2213477Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:46.2313558Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-10-10T00:32:46.2380106Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:32:46.2485466Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-10-10T00:32:46.2544129Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:46.2639920Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-10-10T00:32:46.2694941Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:46.2791345Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:32:46.2845803Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:46.2942371Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-10-10T00:32:46.2995914Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:46.3093519Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-10-10T00:32:46.3150897Z Entering 
'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:46.3246842Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-10-10T00:32:46.3299518Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:46.3395069Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-10-10T00:32:46.3448836Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:46.3546365Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-10-10T00:32:46.3594402Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:46.3691976Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-10-10T00:32:46.3748111Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:46.3845920Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-10-10T00:32:46.3905335Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:46.4001043Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-10-10T00:32:46.4103751Z Entering 'third_party/pocketfft' 2025-10-10T00:32:46.4202858Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 
2025-10-10T00:32:46.4259542Z Entering 'third_party/protobuf' 2025-10-10T00:32:46.4358077Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-10-10T00:32:46.4417492Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:46.4523357Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-10-10T00:32:46.4577974Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:32:46.4676823Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:32:46.4737559Z Entering 'third_party/psimd' 2025-10-10T00:32:46.4839408Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-10-10T00:32:46.4895261Z Entering 'third_party/pthreadpool' 2025-10-10T00:32:46.4995344Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-10-10T00:32:46.5049956Z Entering 'third_party/pybind11' 2025-10-10T00:32:46.5151023Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-10-10T00:32:46.5212715Z Entering 'third_party/python-peachpy' 2025-10-10T00:32:46.5315174Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-10-10T00:32:46.5372356Z Entering 'third_party/sleef' 2025-10-10T00:32:46.5472331Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-10-10T00:32:46.5528671Z Entering 'third_party/tensorpipe' 2025-10-10T00:32:46.5627020Z 
file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-10-10T00:32:46.5682082Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:46.5778728Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:32:46.5831673Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:46.5931334Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-10-10T00:32:46.5984240Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:46.6081459Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-10-10T00:32:46.6135122Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:46.6233147Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-10-10T00:32:46.6280928Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:46.6376405Z file:/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-10-10T00:32:46.7006971Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-10-10T00:32:46.7699466Z Entering 'android/libs/fbjni' 2025-10-10T00:32:46.7813391Z Entering 'third_party/FP16' 2025-10-10T00:32:46.7933399Z Entering 'third_party/FXdiv' 2025-10-10T00:32:46.8048452Z Entering 'third_party/NNPACK' 2025-10-10T00:32:46.8159538Z Entering 'third_party/NVTX' 2025-10-10T00:32:46.8273603Z Entering 'third_party/VulkanMemoryAllocator' 
2025-10-10T00:32:46.8383604Z Entering 'third_party/XNNPACK' 2025-10-10T00:32:46.8525257Z Entering 'third_party/aiter' 2025-10-10T00:32:46.8630287Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:46.8756767Z Entering 'third_party/benchmark' 2025-10-10T00:32:46.8865601Z Entering 'third_party/composable_kernel' 2025-10-10T00:32:46.8994263Z Entering 'third_party/cpp-httplib' 2025-10-10T00:32:46.9100889Z Entering 'third_party/cpuinfo' 2025-10-10T00:32:46.9209504Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:32:46.9315180Z Entering 'third_party/cutlass' 2025-10-10T00:32:46.9447799Z Entering 'third_party/fbgemm' 2025-10-10T00:32:46.9561105Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:46.9664918Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:46.9784912Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:32:46.9886294Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:47.0009201Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:47.0110351Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:47.0207019Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:32:47.0324094Z Entering 'third_party/flash-attention' 2025-10-10T00:32:47.0440852Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:47.0560698Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:47.0686379Z Entering 'third_party/flatbuffers' 2025-10-10T00:32:47.0806385Z Entering 'third_party/fmt' 2025-10-10T00:32:47.0914990Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:47.1026241Z Entering 'third_party/gloo' 2025-10-10T00:32:47.1137188Z Entering 'third_party/googletest' 2025-10-10T00:32:47.1246810Z Entering 'third_party/ideep' 2025-10-10T00:32:47.1351461Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:32:47.1477398Z Entering 'third_party/ittapi' 2025-10-10T00:32:47.1586637Z Entering 'third_party/kineto' 
2025-10-10T00:32:47.1691829Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:47.1792108Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:32:47.1898027Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:47.2000825Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:47.2104835Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:32:47.2205495Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:47.2317318Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:47.2419849Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:47.2521156Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:47.2624167Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:47.2725695Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:47.2823390Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:47.2932074Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:47.3051247Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:47.3149276Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:47.3257397Z Entering 'third_party/kleidiai' 2025-10-10T00:32:47.3367532Z Entering 'third_party/mimalloc' 2025-10-10T00:32:47.3478154Z Entering 'third_party/nlohmann' 2025-10-10T00:32:47.3588976Z Entering 'third_party/onnx' 2025-10-10T00:32:47.3732520Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:47.3849785Z Entering 
'third_party/opentelemetry-cpp' 2025-10-10T00:32:47.3959729Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:47.4059394Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:47.4161344Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:47.4262344Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:47.4365898Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:47.4465353Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:47.4566430Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:47.4663997Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:47.4771374Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:47.4878953Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:47.5024806Z Entering 'third_party/pocketfft' 2025-10-10T00:32:47.5132113Z Entering 'third_party/protobuf' 2025-10-10T00:32:47.5244477Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:47.5344766Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:32:47.5454816Z Entering 'third_party/psimd' 2025-10-10T00:32:47.5564758Z Entering 'third_party/pthreadpool' 2025-10-10T00:32:47.5672171Z Entering 'third_party/pybind11' 2025-10-10T00:32:47.5782011Z Entering 'third_party/python-peachpy' 2025-10-10T00:32:47.5892042Z Entering 'third_party/sleef' 2025-10-10T00:32:47.5998172Z Entering 'third_party/tensorpipe' 2025-10-10T00:32:47.6103460Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:47.6204736Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:47.6305110Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:47.6407525Z Entering 
'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:47.6500814Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:47.6651114Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-10-10T00:32:47.7327529Z Entering 'android/libs/fbjni' 2025-10-10T00:32:47.7436305Z Entering 'third_party/FP16' 2025-10-10T00:32:47.7545464Z Entering 'third_party/FXdiv' 2025-10-10T00:32:47.7652214Z Entering 'third_party/NNPACK' 2025-10-10T00:32:47.7763158Z Entering 'third_party/NVTX' 2025-10-10T00:32:47.7877516Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:32:47.7987485Z Entering 'third_party/XNNPACK' 2025-10-10T00:32:47.8130265Z Entering 'third_party/aiter' 2025-10-10T00:32:47.8245428Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:47.8372834Z Entering 'third_party/benchmark' 2025-10-10T00:32:47.8484324Z Entering 'third_party/composable_kernel' 2025-10-10T00:32:47.8611583Z Entering 'third_party/cpp-httplib' 2025-10-10T00:32:47.8721176Z Entering 'third_party/cpuinfo' 2025-10-10T00:32:47.8829578Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:32:47.8936718Z Entering 'third_party/cutlass' 2025-10-10T00:32:47.9067501Z Entering 'third_party/fbgemm' 2025-10-10T00:32:47.9180718Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:47.9282387Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:47.9405046Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:32:47.9506409Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:47.9628474Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:47.9733157Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:47.9834483Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:32:47.9946747Z Entering 'third_party/flash-attention' 2025-10-10T00:32:48.0055697Z Entering 
'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:48.0168421Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:48.0294944Z Entering 'third_party/flatbuffers' 2025-10-10T00:32:48.0407248Z Entering 'third_party/fmt' 2025-10-10T00:32:48.0515333Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:48.0623085Z Entering 'third_party/gloo' 2025-10-10T00:32:48.0733816Z Entering 'third_party/googletest' 2025-10-10T00:32:48.0838776Z Entering 'third_party/ideep' 2025-10-10T00:32:48.0942714Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:32:48.1061594Z Entering 'third_party/ittapi' 2025-10-10T00:32:48.1168822Z Entering 'third_party/kineto' 2025-10-10T00:32:48.1272187Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:48.1365637Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:32:48.1467508Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:48.1567988Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:48.1667006Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:32:48.1763523Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:48.1868871Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:48.1968417Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:48.2071267Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:48.2173227Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:48.2272336Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:48.2368082Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 
2025-10-10T00:32:48.2474324Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:48.2589382Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:48.2689447Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:48.2797131Z Entering 'third_party/kleidiai' 2025-10-10T00:32:48.2905281Z Entering 'third_party/mimalloc' 2025-10-10T00:32:48.3015708Z Entering 'third_party/nlohmann' 2025-10-10T00:32:48.3125838Z Entering 'third_party/onnx' 2025-10-10T00:32:48.3271451Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:48.3387500Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:32:48.3495936Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:48.3594854Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:48.3694256Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:48.3792639Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:48.3898165Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:48.3994485Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:48.4090200Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:48.4183448Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:48.4284075Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:48.4386029Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:48.4531914Z Entering 'third_party/pocketfft' 2025-10-10T00:32:48.4636995Z Entering 'third_party/protobuf' 2025-10-10T00:32:48.4745583Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:48.4847344Z Entering 'third_party/protobuf/third_party/googletest' 
2025-10-10T00:32:48.4954948Z Entering 'third_party/psimd' 2025-10-10T00:32:48.5061805Z Entering 'third_party/pthreadpool' 2025-10-10T00:32:48.5169327Z Entering 'third_party/pybind11' 2025-10-10T00:32:48.5276900Z Entering 'third_party/python-peachpy' 2025-10-10T00:32:48.5385809Z Entering 'third_party/sleef' 2025-10-10T00:32:48.5491813Z Entering 'third_party/tensorpipe' 2025-10-10T00:32:48.5595345Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:48.5692810Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:48.5789585Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:48.5887444Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:48.5979454Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:48.6119203Z ##[endgroup] 2025-10-10T00:32:48.6252853Z [command]/usr/bin/git log -1 --format=%H 2025-10-10T00:32:48.6351719Z 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:32:48.6582984Z ##[group]Run cd "${GITHUB_WORKSPACE}" 2025-10-10T00:32:48.6583610Z cd "${GITHUB_WORKSPACE}" 2025-10-10T00:32:48.6584157Z # Clean stale submodule dirs 2025-10-10T00:32:48.6584682Z if [ -z "${NO_SUDO}" ]; then 2025-10-10T00:32:48.6585308Z  sudo git submodule foreach --recursive git clean -ffdx 2025-10-10T00:32:48.6585938Z else 2025-10-10T00:32:48.6586434Z  git submodule foreach --recursive git clean -ffdx 2025-10-10T00:32:48.6587016Z fi 2025-10-10T00:32:48.6642330Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:48.6642989Z env: 2025-10-10T00:32:48.6643344Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:48.6643787Z NO_SUDO: true 2025-10-10T00:32:48.6644172Z ##[endgroup] 2025-10-10T00:32:48.7388784Z Entering 'android/libs/fbjni' 2025-10-10T00:32:48.7485276Z Entering 'third_party/FP16' 2025-10-10T00:32:48.7576750Z Entering 'third_party/FXdiv' 2025-10-10T00:32:48.7667452Z Entering 'third_party/NNPACK' 2025-10-10T00:32:48.7768935Z Entering
'third_party/NVTX' 2025-10-10T00:32:48.7883219Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:32:48.7980117Z Entering 'third_party/XNNPACK' 2025-10-10T00:32:48.8366831Z Entering 'third_party/aiter' 2025-10-10T00:32:48.8492741Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:32:48.8790159Z Entering 'third_party/benchmark' 2025-10-10T00:32:48.8890172Z Entering 'third_party/composable_kernel' 2025-10-10T00:32:48.9227252Z Entering 'third_party/cpp-httplib' 2025-10-10T00:32:48.9328549Z Entering 'third_party/cpuinfo' 2025-10-10T00:32:48.9444191Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:32:48.9548019Z Entering 'third_party/cutlass' 2025-10-10T00:32:48.9869731Z Entering 'third_party/fbgemm' 2025-10-10T00:32:49.0057084Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:32:49.0145554Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:32:49.0469162Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:32:49.0562672Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:32:49.0862686Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:32:49.0949165Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:32:49.1027624Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:32:49.1146759Z Entering 'third_party/flash-attention' 2025-10-10T00:32:49.1255970Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:32:49.1530959Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:32:49.1784687Z Entering 'third_party/flatbuffers' 2025-10-10T00:32:49.1981283Z Entering 'third_party/fmt' 2025-10-10T00:32:49.2079407Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:32:49.2172450Z Entering 'third_party/gloo' 2025-10-10T00:32:49.2273102Z Entering 'third_party/googletest' 2025-10-10T00:32:49.2376484Z Entering 'third_party/ideep' 2025-10-10T00:32:49.2462060Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:32:49.2715906Z Entering 
'third_party/ittapi' 2025-10-10T00:32:49.2818475Z Entering 'third_party/kineto' 2025-10-10T00:32:49.2920796Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:32:49.3025397Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:32:49.3156297Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:32:49.3244587Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:32:49.3335631Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:32:49.3415228Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:32:49.3501354Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:32:49.3586726Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:32:49.3680146Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:32:49.3794060Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:32:49.3877779Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:32:49.3962656Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:49.4098415Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:49.4205130Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:32:49.4293082Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:32:49.4392600Z Entering 'third_party/kleidiai' 2025-10-10T00:32:49.4503016Z Entering 'third_party/mimalloc' 2025-10-10T00:32:49.4603861Z Entering 'third_party/nlohmann' 2025-10-10T00:32:49.4737946Z Entering 'third_party/onnx' 2025-10-10T00:32:49.5535135Z Entering 
'third_party/onnx/third_party/pybind11' 2025-10-10T00:32:49.5650538Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:32:49.5814443Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:32:49.5902974Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:32:49.5995991Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:32:49.6077673Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:32:49.6200968Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:32:49.6287599Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:32:49.6373135Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:32:49.6457749Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:32:49.6576925Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:32:49.6675191Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:32:49.7369426Z Entering 'third_party/pocketfft' 2025-10-10T00:32:49.7458279Z Entering 'third_party/protobuf' 2025-10-10T00:32:49.7697692Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:32:49.7784630Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:32:49.7886069Z Entering 'third_party/psimd' 2025-10-10T00:32:49.7978789Z Entering 'third_party/pthreadpool' 2025-10-10T00:32:49.8072248Z Entering 'third_party/pybind11' 2025-10-10T00:32:49.8172592Z Entering 'third_party/python-peachpy' 2025-10-10T00:32:49.8260706Z Entering 'third_party/sleef' 2025-10-10T00:32:49.8362050Z Entering 'third_party/tensorpipe' 2025-10-10T00:32:49.8453769Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:32:49.8538999Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:32:49.8624484Z Entering 
'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:32:49.8720239Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:32:49.8795819Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:32:49.9122010Z Prepare all required actions 2025-10-10T00:32:49.9122924Z Getting action download info 2025-10-10T00:32:50.0618977Z ##[group]Run ./.github/actions/setup-rocm 2025-10-10T00:32:50.0619507Z env: 2025-10-10T00:32:50.0619870Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.0620287Z ##[endgroup] 2025-10-10T00:32:50.0660575Z ##[group]Run dpkg -l | grep -E " rocm" 2025-10-10T00:32:50.0661101Z dpkg -l | grep -E " rocm" 2025-10-10T00:32:50.0713326Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:50.0713897Z env: 2025-10-10T00:32:50.0714331Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.0714702Z ##[endgroup] 2025-10-10T00:32:50.1152448Z ii rocm 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) software stack meta package 2025-10-10T00:32:50.1153821Z ii rocm-cmake 0.14.0.60303-74~22.04 amd64 rocm-cmake built using CMake 2025-10-10T00:32:50.1155320Z ii rocm-core 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1156512Z ii rocm-dbgapi 0.77.0.60303-74~22.04 amd64 Library to provide AMD GPU debugger API 2025-10-10T00:32:50.1157678Z ii rocm-debug-agent 2.0.3.60303-74~22.04 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent) 2025-10-10T00:32:50.1158936Z ii rocm-developer-tools 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1160190Z ii rocm-device-libs 1.0.0.60303-74~22.04 amd64 Radeon Open Compute - device libraries 2025-10-10T00:32:50.1161180Z ii rocm-gdb 15.2.60303-74~22.04 amd64 ROCgdb 2025-10-10T00:32:50.1162261Z ii rocm-hip-libraries 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1163625Z ii rocm-hip-runtime 6.3.3.60303-74~22.04 amd64 Radeon Open Compute
(ROCm) Runtime software stack 2025-10-10T00:32:50.1164827Z ii rocm-hip-runtime-dev 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1165984Z ii rocm-hip-sdk 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1167180Z ii rocm-language-runtime 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1168292Z ii rocm-llvm 18.0.0.25012.60303-74~22.04 amd64 ROCm core compiler 2025-10-10T00:32:50.1169382Z ii rocm-ml-libraries 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1170519Z ii rocm-ml-sdk 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1171516Z ii rocm-opencl 2.0.0.60303-74~22.04 amd64 clr built using CMake 2025-10-10T00:32:50.1172482Z ii rocm-opencl-dev 2.0.0.60303-74~22.04 amd64 clr built using CMake 2025-10-10T00:32:50.1173587Z ii rocm-opencl-runtime 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1174778Z ii rocm-opencl-sdk 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1181383Z ii rocm-openmp-sdk 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) OpenMP Software development Kit. 
2025-10-10T00:32:50.1182580Z ii rocm-smi-lib 7.4.0.60303-74~22.04 amd64 AMD System Management libraries 2025-10-10T00:32:50.1183674Z ii rocm-utils 6.3.3.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-10-10T00:32:50.1184785Z ii rocminfo 1.0.0.60303-74~22.04 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool 2025-10-10T00:32:50.1224502Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-10-10T00:32:50.1225454Z # ignore expansion of "docker ps -q" since it could be empty 2025-10-10T00:32:50.1226149Z # shellcheck disable=SC2046 2025-10-10T00:32:50.1226739Z docker stop $(docker ps -q) || true 2025-10-10T00:32:50.1227309Z # Prune all stopped containers. 2025-10-10T00:32:50.1227862Z docker container prune -f 2025-10-10T00:32:50.1284644Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:50.1285299Z env: 2025-10-10T00:32:50.1285674Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.1286111Z ##[endgroup] 2025-10-10T00:32:50.2025760Z "docker stop" requires at least 1 argument. 2025-10-10T00:32:50.2026725Z See 'docker stop --help'. 2025-10-10T00:32:50.2027087Z 2025-10-10T00:32:50.2027564Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 
2025-10-10T00:32:50.2028048Z 2025-10-10T00:32:50.2028260Z Stop one or more running containers 2025-10-10T00:32:50.2398790Z Total reclaimed space: 0B 2025-10-10T00:32:50.2491393Z ##[group]Run cat /etc/os-release || true 2025-10-10T00:32:50.2492011Z cat /etc/os-release || true 2025-10-10T00:32:50.2492605Z cat /etc/apt/sources.list.d/rocm.list || true 2025-10-10T00:32:50.2493278Z cat /opt/rocm/.info/version || true 2025-10-10T00:32:50.2493815Z whoami 2025-10-10T00:32:50.2547297Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:50.2547948Z env: 2025-10-10T00:32:50.2548307Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.2548724Z ##[endgroup] 2025-10-10T00:32:50.2662050Z PRETTY_NAME="Ubuntu 22.04.4 LTS" 2025-10-10T00:32:50.2662690Z NAME="Ubuntu" 2025-10-10T00:32:50.2663087Z VERSION_ID="22.04" 2025-10-10T00:32:50.2663573Z VERSION="22.04.4 LTS (Jammy Jellyfish)" 2025-10-10T00:32:50.2664142Z VERSION_CODENAME=jammy 2025-10-10T00:32:50.2664578Z ID=ubuntu 2025-10-10T00:32:50.2664948Z ID_LIKE=debian 2025-10-10T00:32:50.2665409Z HOME_URL="https://www.ubuntu.com/" 2025-10-10T00:32:50.2666026Z SUPPORT_URL="https://help.ubuntu.com/" 2025-10-10T00:32:50.2666719Z BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" 2025-10-10T00:32:50.2667715Z PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" 2025-10-10T00:32:50.2668680Z UBUNTU_CODENAME=jammy 2025-10-10T00:32:50.2689230Z deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.3.3 jammy main 2025-10-10T00:32:50.2717914Z 6.3.3-74 2025-10-10T00:32:50.2761435Z pytorchci 2025-10-10T00:32:50.2812714Z ##[group]Run dpkg -l | grep -E " amdgpu" 2025-10-10T00:32:50.2813368Z dpkg -l | grep -E " amdgpu" 2025-10-10T00:32:50.2869487Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:50.2870140Z env: 2025-10-10T00:32:50.2870513Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.2870943Z ##[endgroup] 2025-10-10T00:32:50.3310945Z ii
amdgpu-core 1:6.3.60303-2119913.22.04 all Core meta package for unified amdgpu driver. 2025-10-10T00:32:50.3312166Z ii amdgpu-dkms 1:6.10.5.60303-2119913.22.04 all amdgpu driver in DKMS format. 2025-10-10T00:32:50.3313388Z ii amdgpu-dkms-firmware 1:6.10.5.60303-2119913.22.04 all firmware blobs used by amdgpu driver in DKMS format 2025-10-10T00:32:50.3315773Z ii amdgpu-install 6.3.60303-2119913.22.04 all AMDGPU driver repository and installer 2025-10-10T00:32:50.3426689Z ##[group]Run rocm-smi 2025-10-10T00:32:50.3427195Z rocm-smi 2025-10-10T00:32:50.3485486Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:50.3486158Z env: 2025-10-10T00:32:50.3486546Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.3486993Z ##[endgroup] 2025-10-10T00:32:50.5443971Z 2025-10-10T00:32:50.5444116Z 2025-10-10T00:32:50.5444691Z ========================================= ROCm System Management Interface ========================================= 2025-10-10T00:32:50.5445667Z =================================================== Concise Info =================================================== 2025-10-10T00:32:50.5446672Z Device Node IDs Temp Power Partitions SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2025-10-10T00:32:50.5448355Z  (DID, GUID) (Edge) (Avg) (Mem, Compute, ID)  2025-10-10T00:32:50.5449194Z ==================================================================================================================== 2025-10-10T00:32:50.5450449Z 0 4 0x740c, 57586 49.0°C 88.0W N/A, N/A, 0 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2025-10-10T00:32:50.5451598Z 1 5 0x740c, 45873 41.0°C N/A N/A, N/A, 0 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2025-10-10T00:32:50.5452716Z 2 2 0x740c, 51627 32.0°C 96.0W N/A, N/A, 0 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2025-10-10T00:32:50.5453786Z 3 3 0x740c, 64489 35.0°C N/A N/A, N/A, 0 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2025-10-10T00:32:50.5454829Z 4 8 0x740c, 30939 46.0°C 87.0W N/A, N/A, 0 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2025-10-10T00:32:50.5455927Z 5 9
0x740c, 8466 42.0°C N/A N/A, N/A, 0 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2025-10-10T00:32:50.5456982Z 6 6 0x740c, 41154 37.0°C 86.0W N/A, N/A, 0 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2025-10-10T00:32:50.5458049Z 7 7 0x740c, 63755 40.0°C N/A N/A, N/A, 0 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2025-10-10T00:32:50.5458819Z ==================================================================================================================== 2025-10-10T00:32:50.5459524Z =============================================== End of ROCm SMI Log ================================================ 2025-10-10T00:32:50.5608491Z ##[group]Run rocminfo 2025-10-10T00:32:50.5609054Z rocminfo 2025-10-10T00:32:50.5657310Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:50.5657691Z env: 2025-10-10T00:32:50.5657900Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.5658157Z ##[endgroup] 2025-10-10T00:32:50.7094892Z ROCk module version 6.10.5 is loaded 2025-10-10T00:32:50.7095612Z ===================== 2025-10-10T00:32:50.7096083Z HSA System Attributes 2025-10-10T00:32:50.7096517Z ===================== 2025-10-10T00:32:50.7096934Z Runtime Version: 1.14 2025-10-10T00:32:50.7097397Z Runtime Ext Version: 1.6 2025-10-10T00:32:50.7097861Z System Timestamp Freq.: 1000.000000MHz 2025-10-10T00:32:50.7098666Z Sig.
Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-10-10T00:32:50.7099714Z Machine Model: LARGE 2025-10-10T00:32:50.7100565Z System Endianness: LITTLE 2025-10-10T00:32:50.7101326Z Mwaitx: DISABLED 2025-10-10T00:32:50.7101834Z DMAbuf Support: YES 2025-10-10T00:32:50.7102138Z 2025-10-10T00:32:50.7102289Z ========== 2025-10-10T00:32:50.7103351Z HSA Agents 2025-10-10T00:32:50.7103748Z ========== 2025-10-10T00:32:50.7104510Z ******* 2025-10-10T00:32:50.7104909Z Agent 1 2025-10-10T00:32:50.7105290Z ******* 2025-10-10T00:32:50.7105800Z Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:32:50.7106478Z Uuid: CPU-XX 2025-10-10T00:32:50.7107124Z Marketing Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:32:50.7107827Z Vendor Name: CPU 2025-10-10T00:32:50.7108445Z Feature: None specified 2025-10-10T00:32:50.7109075Z Profile: FULL_PROFILE 2025-10-10T00:32:50.7109712Z Float Round Mode: NEAR 2025-10-10T00:32:50.7110344Z Max Queue Number: 0(0x0) 2025-10-10T00:32:50.7110984Z Queue Min Size: 0(0x0) 2025-10-10T00:32:50.7111612Z Queue Max Size: 0(0x0) 2025-10-10T00:32:50.7112220Z Queue Type: MULTI 2025-10-10T00:32:50.7112809Z Node: 0 2025-10-10T00:32:50.7113406Z Device Type: CPU 2025-10-10T00:32:50.7113957Z Cache Info: 2025-10-10T00:32:50.7114604Z L1: 32768(0x8000) KB 2025-10-10T00:32:50.7115171Z Chip ID: 0(0x0) 2025-10-10T00:32:50.7115829Z ASIC Revision: 0(0x0) 2025-10-10T00:32:50.7116594Z Cacheline Size: 64(0x40) 2025-10-10T00:32:50.7117385Z Max Clock Freq. (MHz): 2000 2025-10-10T00:32:50.7118014Z BDFID: 0 2025-10-10T00:32:50.7118627Z Internal Node ID: 0 2025-10-10T00:32:50.7119262Z Compute Unit: 64 2025-10-10T00:32:50.7119881Z SIMDs per CU: 0 2025-10-10T00:32:50.7120508Z Shader Engines: 0 2025-10-10T00:32:50.7121175Z Shader Arrs. per Eng.: 0 2025-10-10T00:32:50.7121848Z WatchPts on Addr. 
Ranges:1 2025-10-10T00:32:50.7122437Z Memory Properties: 2025-10-10T00:32:50.7122877Z Features: None 2025-10-10T00:32:50.7123318Z Pool Info: 2025-10-10T00:32:50.7123738Z Pool 1 2025-10-10T00:32:50.7124285Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7124930Z Size: 528249788(0x1f7c73bc) KB 2025-10-10T00:32:50.7125572Z Allocatable: TRUE 2025-10-10T00:32:50.7126236Z Alloc Granule: 4KB 2025-10-10T00:32:50.7126919Z Alloc Recommended Granule:4KB 2025-10-10T00:32:50.7127613Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7128276Z Accessible by all: TRUE 2025-10-10T00:32:50.7128847Z Pool 2 2025-10-10T00:32:50.7129472Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7130220Z Size: 528249788(0x1f7c73bc) KB 2025-10-10T00:32:50.7130994Z Allocatable: TRUE 2025-10-10T00:32:50.7131691Z Alloc Granule: 4KB 2025-10-10T00:32:50.7132380Z Alloc Recommended Granule:4KB 2025-10-10T00:32:50.7133435Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7134398Z Accessible by all: TRUE 2025-10-10T00:32:50.7134986Z Pool 3 2025-10-10T00:32:50.7135523Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-10-10T00:32:50.7136141Z Size: 528249788(0x1f7c73bc) KB 2025-10-10T00:32:50.7136760Z Allocatable: TRUE 2025-10-10T00:32:50.7137409Z Alloc Granule: 4KB 2025-10-10T00:32:50.7138078Z Alloc Recommended Granule:4KB 2025-10-10T00:32:50.7138771Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7139549Z Accessible by all: TRUE 2025-10-10T00:32:50.7140240Z Pool 4 2025-10-10T00:32:50.7140870Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7141622Z Size: 528249788(0x1f7c73bc) KB 2025-10-10T00:32:50.7142350Z Allocatable: TRUE 2025-10-10T00:32:50.7143009Z Alloc Granule: 4KB 2025-10-10T00:32:50.7143703Z Alloc Recommended Granule:4KB 2025-10-10T00:32:50.7144382Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7145060Z Accessible by all: TRUE 2025-10-10T00:32:50.7145631Z ISA Info: 2025-10-10T00:32:50.7146064Z ******* 2025-10-10T00:32:50.7146477Z Agent 2 2025-10-10T00:32:50.7146857Z ******* 
2025-10-10T00:32:50.7147332Z Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:32:50.7147954Z Uuid: CPU-XX 2025-10-10T00:32:50.7148603Z Marketing Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:32:50.7149291Z Vendor Name: CPU 2025-10-10T00:32:50.7149918Z Feature: None specified 2025-10-10T00:32:50.7150575Z Profile: FULL_PROFILE 2025-10-10T00:32:50.7151250Z Float Round Mode: NEAR 2025-10-10T00:32:50.7151891Z Max Queue Number: 0(0x0) 2025-10-10T00:32:50.7152545Z Queue Min Size: 0(0x0) 2025-10-10T00:32:50.7153172Z Queue Max Size: 0(0x0) 2025-10-10T00:32:50.7153781Z Queue Type: MULTI 2025-10-10T00:32:50.7154484Z Node: 1 2025-10-10T00:32:50.7155097Z Device Type: CPU 2025-10-10T00:32:50.7155706Z Cache Info: 2025-10-10T00:32:50.7156261Z L1: 32768(0x8000) KB 2025-10-10T00:32:50.7156940Z Chip ID: 0(0x0) 2025-10-10T00:32:50.7157676Z ASIC Revision: 0(0x0) 2025-10-10T00:32:50.7158373Z Cacheline Size: 64(0x40) 2025-10-10T00:32:50.7159297Z Max Clock Freq. (MHz): 2000 2025-10-10T00:32:50.7159913Z BDFID: 0 2025-10-10T00:32:50.7160529Z Internal Node ID: 1 2025-10-10T00:32:50.7161166Z Compute Unit: 64 2025-10-10T00:32:50.7161804Z SIMDs per CU: 0 2025-10-10T00:32:50.7162446Z Shader Engines: 0 2025-10-10T00:32:50.7163200Z Shader Arrs. per Eng.: 0 2025-10-10T00:32:50.7164481Z WatchPts on Addr. 
Ranges:1 2025-10-10T00:32:50.7165612Z Memory Properties: 2025-10-10T00:32:50.7166148Z Features: None 2025-10-10T00:32:50.7166715Z Pool Info: 2025-10-10T00:32:50.7167216Z Pool 1 2025-10-10T00:32:50.7167818Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7168473Z Size: 528402452(0x1f7ec814) KB 2025-10-10T00:32:50.7169114Z Allocatable: TRUE 2025-10-10T00:32:50.7169772Z Alloc Granule: 4KB 2025-10-10T00:32:50.7170476Z Alloc Recommended Granule:4KB 2025-10-10T00:32:50.7171172Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7171872Z Accessible by all: TRUE 2025-10-10T00:32:50.7172480Z Pool 2 2025-10-10T00:32:50.7173017Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7173789Z Size: 528402452(0x1f7ec814) KB 2025-10-10T00:32:50.7174541Z Allocatable: TRUE 2025-10-10T00:32:50.7175321Z Alloc Granule: 4KB 2025-10-10T00:32:50.7176114Z Alloc Recommended Granule:4KB 2025-10-10T00:32:50.7176804Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7177489Z Accessible by all: TRUE 2025-10-10T00:32:50.7178077Z Pool 3 2025-10-10T00:32:50.7178600Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-10-10T00:32:50.7179270Z Size: 528402452(0x1f7ec814) KB 2025-10-10T00:32:50.7179896Z Allocatable: TRUE 2025-10-10T00:32:50.7180547Z Alloc Granule: 4KB 2025-10-10T00:32:50.7181229Z Alloc Recommended Granule:4KB 2025-10-10T00:32:50.7181959Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7182628Z Accessible by all: TRUE 2025-10-10T00:32:50.7183205Z Pool 4 2025-10-10T00:32:50.7183713Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7184335Z Size: 528402452(0x1f7ec814) KB 2025-10-10T00:32:50.7184951Z Allocatable: TRUE 2025-10-10T00:32:50.7185596Z Alloc Granule: 4KB 2025-10-10T00:32:50.7186285Z Alloc Recommended Granule:4KB 2025-10-10T00:32:50.7186985Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7187659Z Accessible by all: TRUE 2025-10-10T00:32:50.7188254Z ISA Info: 2025-10-10T00:32:50.7188661Z ******* 2025-10-10T00:32:50.7189066Z Agent 3 2025-10-10T00:32:50.7189464Z ******* 
2025-10-10T00:32:50.7189911Z Name: gfx90a 2025-10-10T00:32:50.7190515Z Uuid: GPU-963d686164f2ce12 2025-10-10T00:32:50.7191171Z Marketing Name: AMD Instinct MI250X/MI250 2025-10-10T00:32:50.7191836Z Vendor Name: AMD 2025-10-10T00:32:50.7192486Z Feature: KERNEL_DISPATCH 2025-10-10T00:32:50.7193122Z Profile: BASE_PROFILE 2025-10-10T00:32:50.7193773Z Float Round Mode: NEAR 2025-10-10T00:32:50.7195122Z Max Queue Number: 128(0x80) 2025-10-10T00:32:50.7196277Z Queue Min Size: 64(0x40) 2025-10-10T00:32:50.7197071Z Queue Max Size: 131072(0x20000) 2025-10-10T00:32:50.7197834Z Queue Type: MULTI 2025-10-10T00:32:50.7198441Z Node: 2 2025-10-10T00:32:50.7199058Z Device Type: GPU 2025-10-10T00:32:50.7199629Z Cache Info: 2025-10-10T00:32:50.7200116Z L1: 16(0x10) KB 2025-10-10T00:32:50.7200680Z L2: 8192(0x2000) KB 2025-10-10T00:32:50.7201253Z Chip ID: 29708(0x740c) 2025-10-10T00:32:50.7201889Z ASIC Revision: 1(0x1) 2025-10-10T00:32:50.7202549Z Cacheline Size: 128(0x80) 2025-10-10T00:32:50.7203294Z Max Clock Freq. (MHz): 1700 2025-10-10T00:32:50.7204031Z BDFID: 12800 2025-10-10T00:32:50.7204754Z Internal Node ID: 2 2025-10-10T00:32:50.7205533Z Compute Unit: 104 2025-10-10T00:32:50.7206290Z SIMDs per CU: 4 2025-10-10T00:32:50.7207049Z Shader Engines: 8 2025-10-10T00:32:50.7207845Z Shader Arrs. per Eng.: 1 2025-10-10T00:32:50.7208545Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:32:50.7209237Z Coherent Host Access: FALSE 2025-10-10T00:32:50.7209861Z Memory Properties: 2025-10-10T00:32:50.7210355Z Features: KERNEL_DISPATCH 2025-10-10T00:32:50.7210974Z Fast F16 Operation: TRUE 2025-10-10T00:32:50.7211660Z Wavefront Size: 64(0x40) 2025-10-10T00:32:50.7212314Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7212929Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7213497Z x 1024(0x400) 2025-10-10T00:32:50.7214129Z y 1024(0x400) 2025-10-10T00:32:50.7214754Z z 1024(0x400) 2025-10-10T00:32:50.7215450Z Max Waves Per CU: 32(0x20) 2025-10-10T00:32:50.7216215Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:32:50.7216877Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7217444Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7217907Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7218445Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7218979Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7219593Z Max fbarriers/Workgrp: 32 2025-10-10T00:32:50.7228925Z Packet Processor uCode:: 92 2025-10-10T00:32:50.7229728Z SDMA engine uCode:: 9 2025-10-10T00:32:50.7230445Z IOMMU Support:: None 2025-10-10T00:32:50.7231034Z Pool Info: 2025-10-10T00:32:50.7231484Z Pool 1 2025-10-10T00:32:50.7232021Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7232662Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7233295Z Allocatable: TRUE 2025-10-10T00:32:50.7233947Z Alloc Granule: 4KB 2025-10-10T00:32:50.7235556Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7236431Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7237223Z Accessible by all: FALSE 2025-10-10T00:32:50.7237891Z Pool 2 2025-10-10T00:32:50.7238431Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7239073Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7239725Z Allocatable: TRUE 2025-10-10T00:32:50.7240394Z Alloc Granule: 4KB 2025-10-10T00:32:50.7241082Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7241783Z Alloc Alignment: 4KB 
2025-10-10T00:32:50.7242453Z Accessible by all: FALSE 2025-10-10T00:32:50.7243070Z Pool 3 2025-10-10T00:32:50.7243706Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7244433Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7245166Z Allocatable: TRUE 2025-10-10T00:32:50.7245938Z Alloc Granule: 4KB 2025-10-10T00:32:50.7246740Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7247556Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7248249Z Accessible by all: FALSE 2025-10-10T00:32:50.7248821Z Pool 4 2025-10-10T00:32:50.7249318Z Segment: GROUP 2025-10-10T00:32:50.7249910Z Size: 64(0x40) KB 2025-10-10T00:32:50.7250533Z Allocatable: FALSE 2025-10-10T00:32:50.7251192Z Alloc Granule: 0KB 2025-10-10T00:32:50.7251847Z Alloc Recommended Granule:0KB 2025-10-10T00:32:50.7252524Z Alloc Alignment: 0KB 2025-10-10T00:32:50.7253215Z Accessible by all: FALSE 2025-10-10T00:32:50.7253902Z ISA Info: 2025-10-10T00:32:50.7254420Z ISA 1 2025-10-10T00:32:50.7255071Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:32:50.7255924Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:32:50.7256633Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:32:50.7257300Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7257999Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7258644Z Fast f16: TRUE 2025-10-10T00:32:50.7259278Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7259906Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7260487Z x 1024(0x400) 2025-10-10T00:32:50.7261059Z y 1024(0x400) 2025-10-10T00:32:50.7261620Z z 1024(0x400) 2025-10-10T00:32:50.7262224Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7262824Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7263325Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7263856Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7264422Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7265605Z FBarrier Max Size: 32 2025-10-10T00:32:50.7266193Z ******* 2025-10-10T00:32:50.7266604Z Agent 4 2025-10-10T00:32:50.7266995Z ******* 
2025-10-10T00:32:50.7267462Z Name: gfx90a 2025-10-10T00:32:50.7268081Z Uuid: GPU-915b6eb937f8a736 2025-10-10T00:32:50.7268734Z Marketing Name: AMD Instinct MI250X/MI250 2025-10-10T00:32:50.7269405Z Vendor Name: AMD 2025-10-10T00:32:50.7270052Z Feature: KERNEL_DISPATCH 2025-10-10T00:32:50.7270681Z Profile: BASE_PROFILE 2025-10-10T00:32:50.7271327Z Float Round Mode: NEAR 2025-10-10T00:32:50.7271978Z Max Queue Number: 128(0x80) 2025-10-10T00:32:50.7272650Z Queue Min Size: 64(0x40) 2025-10-10T00:32:50.7273299Z Queue Max Size: 131072(0x20000) 2025-10-10T00:32:50.7273917Z Queue Type: MULTI 2025-10-10T00:32:50.7274888Z Node: 3 2025-10-10T00:32:50.7275512Z Device Type: GPU 2025-10-10T00:32:50.7276194Z Cache Info: 2025-10-10T00:32:50.7276779Z L1: 16(0x10) KB 2025-10-10T00:32:50.7277432Z L2: 8192(0x2000) KB 2025-10-10T00:32:50.7278088Z Chip ID: 29708(0x740c) 2025-10-10T00:32:50.7278732Z ASIC Revision: 1(0x1) 2025-10-10T00:32:50.7279377Z Cacheline Size: 128(0x80) 2025-10-10T00:32:50.7280049Z Max Clock Freq. (MHz): 1700 2025-10-10T00:32:50.7280691Z BDFID: 13568 2025-10-10T00:32:50.7281303Z Internal Node ID: 3 2025-10-10T00:32:50.7281954Z Compute Unit: 104 2025-10-10T00:32:50.7282582Z SIMDs per CU: 4 2025-10-10T00:32:50.7283277Z Shader Engines: 8 2025-10-10T00:32:50.7284077Z Shader Arrs. per Eng.: 1 2025-10-10T00:32:50.7284873Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:32:50.7285692Z Coherent Host Access: FALSE 2025-10-10T00:32:50.7286413Z Memory Properties: 2025-10-10T00:32:50.7286986Z Features: KERNEL_DISPATCH 2025-10-10T00:32:50.7287696Z Fast F16 Operation: TRUE 2025-10-10T00:32:50.7288334Z Wavefront Size: 64(0x40) 2025-10-10T00:32:50.7288923Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7289467Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7289926Z x 1024(0x400) 2025-10-10T00:32:50.7290403Z y 1024(0x400) 2025-10-10T00:32:50.7290873Z z 1024(0x400) 2025-10-10T00:32:50.7291385Z Max Waves Per CU: 32(0x20) 2025-10-10T00:32:50.7291975Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:32:50.7292559Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7293070Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7293564Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7294184Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7295336Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7296247Z Max fbarriers/Workgrp: 32 2025-10-10T00:32:50.7296906Z Packet Processor uCode:: 92 2025-10-10T00:32:50.7297638Z SDMA engine uCode:: 9 2025-10-10T00:32:50.7298237Z IOMMU Support:: None 2025-10-10T00:32:50.7298744Z Pool Info: 2025-10-10T00:32:50.7299128Z Pool 1 2025-10-10T00:32:50.7299614Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7300185Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7300756Z Allocatable: TRUE 2025-10-10T00:32:50.7301389Z Alloc Granule: 4KB 2025-10-10T00:32:50.7302149Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7302913Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7303622Z Accessible by all: FALSE 2025-10-10T00:32:50.7304240Z Pool 2 2025-10-10T00:32:50.7304790Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7305353Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7305908Z Allocatable: TRUE 2025-10-10T00:32:50.7306484Z Alloc Granule: 4KB 2025-10-10T00:32:50.7307106Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7307731Z Alloc Alignment: 4KB 
2025-10-10T00:32:50.7308327Z Accessible by all: FALSE 2025-10-10T00:32:50.7308855Z Pool 3 2025-10-10T00:32:50.7309327Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7309874Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7310428Z Allocatable: TRUE 2025-10-10T00:32:50.7311029Z Alloc Granule: 4KB 2025-10-10T00:32:50.7311642Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7312266Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7312859Z Accessible by all: FALSE 2025-10-10T00:32:50.7313376Z Pool 4 2025-10-10T00:32:50.7313822Z Segment: GROUP 2025-10-10T00:32:50.7314443Z Size: 64(0x40) KB 2025-10-10T00:32:50.7315002Z Allocatable: FALSE 2025-10-10T00:32:50.7315613Z Alloc Granule: 0KB 2025-10-10T00:32:50.7316336Z Alloc Recommended Granule:0KB 2025-10-10T00:32:50.7317092Z Alloc Alignment: 0KB 2025-10-10T00:32:50.7317789Z Accessible by all: FALSE 2025-10-10T00:32:50.7318321Z ISA Info: 2025-10-10T00:32:50.7318698Z ISA 1 2025-10-10T00:32:50.7319176Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:32:50.7319820Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:32:50.7320431Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:32:50.7321026Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7321655Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7322574Z Fast f16: TRUE 2025-10-10T00:32:50.7323516Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7324192Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7324753Z x 1024(0x400) 2025-10-10T00:32:50.7325335Z y 1024(0x400) 2025-10-10T00:32:50.7325896Z z 1024(0x400) 2025-10-10T00:32:50.7326509Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7327141Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7327664Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7328148Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7328630Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7329176Z FBarrier Max Size: 32 2025-10-10T00:32:50.7329684Z ******* 2025-10-10T00:32:50.7330059Z Agent 5 2025-10-10T00:32:50.7330404Z ******* 
2025-10-10T00:32:50.7330811Z Name: gfx90a 2025-10-10T00:32:50.7331341Z Uuid: GPU-2e1c5b2ef60aec01 2025-10-10T00:32:50.7331934Z Marketing Name: AMD Instinct MI250X/MI250 2025-10-10T00:32:50.7332535Z Vendor Name: AMD 2025-10-10T00:32:50.7333119Z Feature: KERNEL_DISPATCH 2025-10-10T00:32:50.7333814Z Profile: BASE_PROFILE 2025-10-10T00:32:50.7334502Z Float Round Mode: NEAR 2025-10-10T00:32:50.7335181Z Max Queue Number: 128(0x80) 2025-10-10T00:32:50.7335863Z Queue Min Size: 64(0x40) 2025-10-10T00:32:50.7336462Z Queue Max Size: 131072(0x20000) 2025-10-10T00:32:50.7337083Z Queue Type: MULTI 2025-10-10T00:32:50.7337619Z Node: 4 2025-10-10T00:32:50.7338150Z Device Type: GPU 2025-10-10T00:32:50.7338665Z Cache Info: 2025-10-10T00:32:50.7339076Z L1: 16(0x10) KB 2025-10-10T00:32:50.7339560Z L2: 8192(0x2000) KB 2025-10-10T00:32:50.7340066Z Chip ID: 29708(0x740c) 2025-10-10T00:32:50.7340623Z ASIC Revision: 1(0x1) 2025-10-10T00:32:50.7341182Z Cacheline Size: 128(0x80) 2025-10-10T00:32:50.7341759Z Max Clock Freq. (MHz): 1700 2025-10-10T00:32:50.7342295Z BDFID: 4352 2025-10-10T00:32:50.7342843Z Internal Node ID: 4 2025-10-10T00:32:50.7343416Z Compute Unit: 104 2025-10-10T00:32:50.7343961Z SIMDs per CU: 4 2025-10-10T00:32:50.7344552Z Shader Engines: 8 2025-10-10T00:32:50.7345143Z Shader Arrs. per Eng.: 1 2025-10-10T00:32:50.7345741Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:32:50.7346362Z Coherent Host Access: FALSE 2025-10-10T00:32:50.7346912Z Memory Properties: 2025-10-10T00:32:50.7347332Z Features: KERNEL_DISPATCH 2025-10-10T00:32:50.7347886Z Fast F16 Operation: TRUE 2025-10-10T00:32:50.7348492Z Wavefront Size: 64(0x40) 2025-10-10T00:32:50.7349348Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7350119Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7350579Z x 1024(0x400) 2025-10-10T00:32:50.7351067Z y 1024(0x400) 2025-10-10T00:32:50.7351547Z z 1024(0x400) 2025-10-10T00:32:50.7352068Z Max Waves Per CU: 32(0x20) 2025-10-10T00:32:50.7352673Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:32:50.7353258Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7353803Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7354340Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7354833Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7355344Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7356008Z Max fbarriers/Workgrp: 32 2025-10-10T00:32:50.7356760Z Packet Processor uCode:: 92 2025-10-10T00:32:50.7357506Z SDMA engine uCode:: 9 2025-10-10T00:32:50.7358158Z IOMMU Support:: None 2025-10-10T00:32:50.7358688Z Pool Info: 2025-10-10T00:32:50.7359084Z Pool 1 2025-10-10T00:32:50.7359566Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7360162Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7360747Z Allocatable: TRUE 2025-10-10T00:32:50.7361341Z Alloc Granule: 4KB 2025-10-10T00:32:50.7361969Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7362624Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7363320Z Accessible by all: FALSE 2025-10-10T00:32:50.7363950Z Pool 2 2025-10-10T00:32:50.7364527Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7365237Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7365912Z Allocatable: TRUE 2025-10-10T00:32:50.7366610Z Alloc Granule: 4KB 2025-10-10T00:32:50.7367357Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7368051Z Alloc Alignment: 4KB 
2025-10-10T00:32:50.7368652Z Accessible by all: FALSE 2025-10-10T00:32:50.7369185Z Pool 3 2025-10-10T00:32:50.7369643Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7370221Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7370787Z Allocatable: TRUE 2025-10-10T00:32:50.7371370Z Alloc Granule: 4KB 2025-10-10T00:32:50.7371982Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7372605Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7373224Z Accessible by all: FALSE 2025-10-10T00:32:50.7373850Z Pool 4 2025-10-10T00:32:50.7374366Z Segment: GROUP 2025-10-10T00:32:50.7375026Z Size: 64(0x40) KB 2025-10-10T00:32:50.7375689Z Allocatable: FALSE 2025-10-10T00:32:50.7376279Z Alloc Granule: 0KB 2025-10-10T00:32:50.7377459Z Alloc Recommended Granule:0KB 2025-10-10T00:32:50.7378094Z Alloc Alignment: 0KB 2025-10-10T00:32:50.7378697Z Accessible by all: FALSE 2025-10-10T00:32:50.7379225Z ISA Info: 2025-10-10T00:32:50.7379604Z ISA 1 2025-10-10T00:32:50.7380085Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:32:50.7380733Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:32:50.7381348Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:32:50.7381972Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7382605Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7383183Z Fast f16: TRUE 2025-10-10T00:32:50.7383796Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7384401Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7384884Z x 1024(0x400) 2025-10-10T00:32:50.7385386Z y 1024(0x400) 2025-10-10T00:32:50.7385867Z z 1024(0x400) 2025-10-10T00:32:50.7386405Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7386951Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7387392Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7387894Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7388390Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7388935Z FBarrier Max Size: 32 2025-10-10T00:32:50.7389472Z ******* 2025-10-10T00:32:50.7389832Z Agent 6 2025-10-10T00:32:50.7390203Z ******* 
2025-10-10T00:32:50.7390635Z Name: gfx90a 2025-10-10T00:32:50.7391163Z Uuid: GPU-885706dc5002792b 2025-10-10T00:32:50.7391753Z Marketing Name: AMD Instinct MI250X/MI250 2025-10-10T00:32:50.7392362Z Vendor Name: AMD 2025-10-10T00:32:50.7392921Z Feature: KERNEL_DISPATCH 2025-10-10T00:32:50.7393493Z Profile: BASE_PROFILE 2025-10-10T00:32:50.7394292Z Float Round Mode: NEAR 2025-10-10T00:32:50.7394907Z Max Queue Number: 128(0x80) 2025-10-10T00:32:50.7395484Z Queue Min Size: 64(0x40) 2025-10-10T00:32:50.7396055Z Queue Max Size: 131072(0x20000) 2025-10-10T00:32:50.7396626Z Queue Type: MULTI 2025-10-10T00:32:50.7397161Z Node: 5 2025-10-10T00:32:50.7397697Z Device Type: GPU 2025-10-10T00:32:50.7398206Z Cache Info: 2025-10-10T00:32:50.7398612Z L1: 16(0x10) KB 2025-10-10T00:32:50.7399106Z L2: 8192(0x2000) KB 2025-10-10T00:32:50.7399614Z Chip ID: 29708(0x740c) 2025-10-10T00:32:50.7400158Z ASIC Revision: 1(0x1) 2025-10-10T00:32:50.7400741Z Cacheline Size: 128(0x80) 2025-10-10T00:32:50.7401326Z Max Clock Freq. (MHz): 1700 2025-10-10T00:32:50.7401859Z BDFID: 5120 2025-10-10T00:32:50.7402717Z Internal Node ID: 5 2025-10-10T00:32:50.7403539Z Compute Unit: 104 2025-10-10T00:32:50.7404093Z SIMDs per CU: 4 2025-10-10T00:32:50.7404664Z Shader Engines: 8 2025-10-10T00:32:50.7405243Z Shader Arrs. per Eng.: 1 2025-10-10T00:32:50.7405853Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:32:50.7406474Z Coherent Host Access: FALSE 2025-10-10T00:32:50.7407008Z Memory Properties: 2025-10-10T00:32:50.7407452Z Features: KERNEL_DISPATCH 2025-10-10T00:32:50.7408008Z Fast F16 Operation: TRUE 2025-10-10T00:32:50.7408599Z Wavefront Size: 64(0x40) 2025-10-10T00:32:50.7409200Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7409847Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7410402Z x 1024(0x400) 2025-10-10T00:32:50.7410974Z y 1024(0x400) 2025-10-10T00:32:50.7411519Z z 1024(0x400) 2025-10-10T00:32:50.7412139Z Max Waves Per CU: 32(0x20) 2025-10-10T00:32:50.7412819Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:32:50.7413406Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7413934Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7414347Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7414835Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7415318Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7415874Z Max fbarriers/Workgrp: 32 2025-10-10T00:32:50.7416520Z Packet Processor uCode:: 92 2025-10-10T00:32:50.7417156Z SDMA engine uCode:: 9 2025-10-10T00:32:50.7417747Z IOMMU Support:: None 2025-10-10T00:32:50.7418267Z Pool Info: 2025-10-10T00:32:50.7418662Z Pool 1 2025-10-10T00:32:50.7419135Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7419725Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7420297Z Allocatable: TRUE 2025-10-10T00:32:50.7420898Z Alloc Granule: 4KB 2025-10-10T00:32:50.7421530Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7422158Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7422797Z Accessible by all: FALSE 2025-10-10T00:32:50.7423333Z Pool 2 2025-10-10T00:32:50.7423797Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7424362Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7424905Z Allocatable: TRUE 2025-10-10T00:32:50.7425490Z Alloc Granule: 4KB 2025-10-10T00:32:50.7426104Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7426715Z Alloc Alignment: 4KB 
2025-10-10T00:32:50.7427327Z Accessible by all: FALSE 2025-10-10T00:32:50.7427840Z Pool 3 2025-10-10T00:32:50.7428287Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7429133Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7429891Z Allocatable: TRUE 2025-10-10T00:32:50.7430502Z Alloc Granule: 4KB 2025-10-10T00:32:50.7431120Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7431724Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7432323Z Accessible by all: FALSE 2025-10-10T00:32:50.7432841Z Pool 4 2025-10-10T00:32:50.7433278Z Segment: GROUP 2025-10-10T00:32:50.7433816Z Size: 64(0x40) KB 2025-10-10T00:32:50.7434469Z Allocatable: FALSE 2025-10-10T00:32:50.7435045Z Alloc Granule: 0KB 2025-10-10T00:32:50.7435678Z Alloc Recommended Granule:0KB 2025-10-10T00:32:50.7436324Z Alloc Alignment: 0KB 2025-10-10T00:32:50.7436949Z Accessible by all: FALSE 2025-10-10T00:32:50.7437498Z ISA Info: 2025-10-10T00:32:50.7437868Z ISA 1 2025-10-10T00:32:50.7438363Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:32:50.7439016Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:32:50.7439688Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:32:50.7440422Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7441173Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7441862Z Fast f16: TRUE 2025-10-10T00:32:50.7442579Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7443241Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7443825Z x 1024(0x400) 2025-10-10T00:32:50.7444481Z y 1024(0x400) 2025-10-10T00:32:50.7445028Z z 1024(0x400) 2025-10-10T00:32:50.7445687Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7446279Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7446707Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7447206Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7447884Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7448522Z FBarrier Max Size: 32 2025-10-10T00:32:50.7449041Z ******* 2025-10-10T00:32:50.7449438Z Agent 7 2025-10-10T00:32:50.7449786Z ******* 
2025-10-10T00:32:50.7450210Z Name: gfx90a 2025-10-10T00:32:50.7450761Z Uuid: GPU-052333bdda4adfee 2025-10-10T00:32:50.7451365Z Marketing Name: AMD Instinct MI250X/MI250 2025-10-10T00:32:50.7452094Z Vendor Name: AMD 2025-10-10T00:32:50.7452755Z Feature: KERNEL_DISPATCH 2025-10-10T00:32:50.7453453Z Profile: BASE_PROFILE 2025-10-10T00:32:50.7454145Z Float Round Mode: NEAR 2025-10-10T00:32:50.7454826Z Max Queue Number: 128(0x80) 2025-10-10T00:32:50.7455417Z Queue Min Size: 64(0x40) 2025-10-10T00:32:50.7455990Z Queue Max Size: 131072(0x20000) 2025-10-10T00:32:50.7456850Z Queue Type: MULTI 2025-10-10T00:32:50.7457646Z Node: 6 2025-10-10T00:32:50.7458186Z Device Type: GPU 2025-10-10T00:32:50.7458699Z Cache Info: 2025-10-10T00:32:50.7459120Z L1: 16(0x10) KB 2025-10-10T00:32:50.7459612Z L2: 8192(0x2000) KB 2025-10-10T00:32:50.7460114Z Chip ID: 29708(0x740c) 2025-10-10T00:32:50.7460659Z ASIC Revision: 1(0x1) 2025-10-10T00:32:50.7461231Z Cacheline Size: 128(0x80) 2025-10-10T00:32:50.7461804Z Max Clock Freq. (MHz): 1700 2025-10-10T00:32:50.7462334Z BDFID: 44544 2025-10-10T00:32:50.7462887Z Internal Node ID: 6 2025-10-10T00:32:50.7463473Z Compute Unit: 104 2025-10-10T00:32:50.7464044Z SIMDs per CU: 4 2025-10-10T00:32:50.7464618Z Shader Engines: 8 2025-10-10T00:32:50.7465215Z Shader Arrs. per Eng.: 1 2025-10-10T00:32:50.7465831Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:32:50.7466458Z Coherent Host Access: FALSE 2025-10-10T00:32:50.7467006Z Memory Properties: 2025-10-10T00:32:50.7467425Z Features: KERNEL_DISPATCH 2025-10-10T00:32:50.7467986Z Fast F16 Operation: TRUE 2025-10-10T00:32:50.7468582Z Wavefront Size: 64(0x40) 2025-10-10T00:32:50.7469200Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7469778Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7470251Z x 1024(0x400) 2025-10-10T00:32:50.7470760Z y 1024(0x400) 2025-10-10T00:32:50.7471248Z z 1024(0x400) 2025-10-10T00:32:50.7471767Z Max Waves Per CU: 32(0x20) 2025-10-10T00:32:50.7472371Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:32:50.7472961Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7473521Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7473980Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7474573Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7475093Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7475668Z Max fbarriers/Workgrp: 32 2025-10-10T00:32:50.7476330Z Packet Processor uCode:: 92 2025-10-10T00:32:50.7476965Z SDMA engine uCode:: 9 2025-10-10T00:32:50.7477561Z IOMMU Support:: None 2025-10-10T00:32:50.7478089Z Pool Info: 2025-10-10T00:32:50.7478475Z Pool 1 2025-10-10T00:32:50.7478945Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7479559Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7480246Z Allocatable: TRUE 2025-10-10T00:32:50.7480946Z Alloc Granule: 4KB 2025-10-10T00:32:50.7481699Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7482460Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7483182Z Accessible by all: FALSE 2025-10-10T00:32:50.7484165Z Pool 2 2025-10-10T00:32:50.7485025Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7485731Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7486333Z Allocatable: TRUE 2025-10-10T00:32:50.7486931Z Alloc Granule: 4KB 2025-10-10T00:32:50.7487549Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7488184Z Alloc Alignment: 4KB 
2025-10-10T00:32:50.7488784Z Accessible by all: FALSE 2025-10-10T00:32:50.7489319Z Pool 3 2025-10-10T00:32:50.7489769Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7490332Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7490909Z Allocatable: TRUE 2025-10-10T00:32:50.7491546Z Alloc Granule: 4KB 2025-10-10T00:32:50.7492284Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7493012Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7493715Z Accessible by all: FALSE 2025-10-10T00:32:50.7494318Z Pool 4 2025-10-10T00:32:50.7494808Z Segment: GROUP 2025-10-10T00:32:50.7495348Z Size: 64(0x40) KB 2025-10-10T00:32:50.7515583Z Allocatable: FALSE 2025-10-10T00:32:50.7516412Z Alloc Granule: 0KB 2025-10-10T00:32:50.7517083Z Alloc Recommended Granule:0KB 2025-10-10T00:32:50.7517768Z Alloc Alignment: 0KB 2025-10-10T00:32:50.7518398Z Accessible by all: FALSE 2025-10-10T00:32:50.7518951Z ISA Info: 2025-10-10T00:32:50.7519345Z ISA 1 2025-10-10T00:32:50.7519909Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:32:50.7520722Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:32:50.7521481Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:32:50.7522221Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7522960Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7523633Z Fast f16: TRUE 2025-10-10T00:32:50.7524315Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7524995Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7525571Z x 1024(0x400) 2025-10-10T00:32:50.7526139Z y 1024(0x400) 2025-10-10T00:32:50.7526610Z z 1024(0x400) 2025-10-10T00:32:50.7527135Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7527661Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7528108Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7528576Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7529058Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7529592Z FBarrier Max Size: 32 2025-10-10T00:32:50.7530103Z ******* 2025-10-10T00:32:50.7530476Z Agent 8 2025-10-10T00:32:50.7530825Z ******* 
2025-10-10T00:32:50.7531705Z Name: gfx90a 2025-10-10T00:32:50.7532719Z Uuid: GPU-648b8d31dd305074 2025-10-10T00:32:50.7533430Z Marketing Name: AMD Instinct MI250X/MI250 2025-10-10T00:32:50.7534126Z Vendor Name: AMD 2025-10-10T00:32:50.7534788Z Feature: KERNEL_DISPATCH 2025-10-10T00:32:50.7535376Z Profile: BASE_PROFILE 2025-10-10T00:32:50.7535960Z Float Round Mode: NEAR 2025-10-10T00:32:50.7536535Z Max Queue Number: 128(0x80) 2025-10-10T00:32:50.7537116Z Queue Min Size: 64(0x40) 2025-10-10T00:32:50.7537679Z Queue Max Size: 131072(0x20000) 2025-10-10T00:32:50.7538233Z Queue Type: MULTI 2025-10-10T00:32:50.7538776Z Node: 7 2025-10-10T00:32:50.7539306Z Device Type: GPU 2025-10-10T00:32:50.7539821Z Cache Info: 2025-10-10T00:32:50.7540252Z L1: 16(0x10) KB 2025-10-10T00:32:50.7540754Z L2: 8192(0x2000) KB 2025-10-10T00:32:50.7541296Z Chip ID: 29708(0x740c) 2025-10-10T00:32:50.7541887Z ASIC Revision: 1(0x1) 2025-10-10T00:32:50.7542461Z Cacheline Size: 128(0x80) 2025-10-10T00:32:50.7543052Z Max Clock Freq. (MHz): 1700 2025-10-10T00:32:50.7543579Z BDFID: 45824 2025-10-10T00:32:50.7544137Z Internal Node ID: 7 2025-10-10T00:32:50.7544721Z Compute Unit: 104 2025-10-10T00:32:50.7545299Z SIMDs per CU: 4 2025-10-10T00:32:50.7545879Z Shader Engines: 8 2025-10-10T00:32:50.7546482Z Shader Arrs. per Eng.: 1 2025-10-10T00:32:50.7547081Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:32:50.7547709Z Coherent Host Access: FALSE 2025-10-10T00:32:50.7548264Z Memory Properties: 2025-10-10T00:32:50.7548699Z Features: KERNEL_DISPATCH 2025-10-10T00:32:50.7549257Z Fast F16 Operation: TRUE 2025-10-10T00:32:50.7549845Z Wavefront Size: 64(0x40) 2025-10-10T00:32:50.7550464Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7551014Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7551468Z x 1024(0x400) 2025-10-10T00:32:50.7551972Z y 1024(0x400) 2025-10-10T00:32:50.7552465Z z 1024(0x400) 2025-10-10T00:32:50.7552980Z Max Waves Per CU: 32(0x20) 2025-10-10T00:32:50.7553584Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:32:50.7554248Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7554772Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7555188Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7555658Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7556144Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7556694Z Max fbarriers/Workgrp: 32 2025-10-10T00:32:50.7557338Z Packet Processor uCode:: 92 2025-10-10T00:32:50.7558283Z SDMA engine uCode:: 9 2025-10-10T00:32:50.7559135Z IOMMU Support:: None 2025-10-10T00:32:50.7559694Z Pool Info: 2025-10-10T00:32:50.7560153Z Pool 1 2025-10-10T00:32:50.7560711Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7561412Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7562084Z Allocatable: TRUE 2025-10-10T00:32:50.7562772Z Alloc Granule: 4KB 2025-10-10T00:32:50.7563503Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7564253Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7564986Z Accessible by all: FALSE 2025-10-10T00:32:50.7565617Z Pool 2 2025-10-10T00:32:50.7566168Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7566781Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7567368Z Allocatable: TRUE 2025-10-10T00:32:50.7567949Z Alloc Granule: 4KB 2025-10-10T00:32:50.7568579Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7569209Z Alloc Alignment: 4KB 
2025-10-10T00:32:50.7569804Z Accessible by all: FALSE 2025-10-10T00:32:50.7570323Z Pool 3 2025-10-10T00:32:50.7570778Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7571328Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7571941Z Allocatable: TRUE 2025-10-10T00:32:50.7572635Z Alloc Granule: 4KB 2025-10-10T00:32:50.7573366Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7574099Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7574803Z Accessible by all: FALSE 2025-10-10T00:32:50.7575343Z Pool 4 2025-10-10T00:32:50.7575773Z Segment: GROUP 2025-10-10T00:32:50.7576310Z Size: 64(0x40) KB 2025-10-10T00:32:50.7576867Z Allocatable: FALSE 2025-10-10T00:32:50.7577446Z Alloc Granule: 0KB 2025-10-10T00:32:50.7578061Z Alloc Recommended Granule:0KB 2025-10-10T00:32:50.7578677Z Alloc Alignment: 0KB 2025-10-10T00:32:50.7579276Z Accessible by all: FALSE 2025-10-10T00:32:50.7579807Z ISA Info: 2025-10-10T00:32:50.7580175Z ISA 1 2025-10-10T00:32:50.7580673Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:32:50.7581316Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:32:50.7581912Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:32:50.7582523Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7583141Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7583700Z Fast f16: TRUE 2025-10-10T00:32:50.7584277Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7584829Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7585305Z x 1024(0x400) 2025-10-10T00:32:50.7586248Z y 1024(0x400) 2025-10-10T00:32:50.7586727Z z 1024(0x400) 2025-10-10T00:32:50.7587259Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7587785Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7588212Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7588698Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7589173Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7589708Z FBarrier Max Size: 32 2025-10-10T00:32:50.7590221Z ******* 2025-10-10T00:32:50.7590574Z Agent 9 2025-10-10T00:32:50.7590923Z ******* 
2025-10-10T00:32:50.7591326Z Name: gfx90a 2025-10-10T00:32:50.7591867Z Uuid: GPU-065f01543a0c255e 2025-10-10T00:32:50.7592463Z Marketing Name: AMD Instinct MI250X/MI250 2025-10-10T00:32:50.7593061Z Vendor Name: AMD 2025-10-10T00:32:50.7593626Z Feature: KERNEL_DISPATCH 2025-10-10T00:32:50.7594287Z Profile: BASE_PROFILE 2025-10-10T00:32:50.7594855Z Float Round Mode: NEAR 2025-10-10T00:32:50.7595445Z Max Queue Number: 128(0x80) 2025-10-10T00:32:50.7596017Z Queue Min Size: 64(0x40) 2025-10-10T00:32:50.7596571Z Queue Max Size: 131072(0x20000) 2025-10-10T00:32:50.7597135Z Queue Type: MULTI 2025-10-10T00:32:50.7597655Z Node: 8 2025-10-10T00:32:50.7598185Z Device Type: GPU 2025-10-10T00:32:50.7598695Z Cache Info: 2025-10-10T00:32:50.7599102Z L1: 16(0x10) KB 2025-10-10T00:32:50.7599593Z L2: 8192(0x2000) KB 2025-10-10T00:32:50.7600201Z Chip ID: 29708(0x740c) 2025-10-10T00:32:50.7600842Z ASIC Revision: 1(0x1) 2025-10-10T00:32:50.7601536Z Cacheline Size: 128(0x80) 2025-10-10T00:32:50.7602226Z Max Clock Freq. (MHz): 1700 2025-10-10T00:32:50.7602854Z BDFID: 36352 2025-10-10T00:32:50.7603507Z Internal Node ID: 8 2025-10-10T00:32:50.7604187Z Compute Unit: 104 2025-10-10T00:32:50.7604854Z SIMDs per CU: 4 2025-10-10T00:32:50.7605548Z Shader Engines: 8 2025-10-10T00:32:50.7606234Z Shader Arrs. per Eng.: 1 2025-10-10T00:32:50.7606851Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:32:50.7607474Z Coherent Host Access: FALSE 2025-10-10T00:32:50.7608015Z Memory Properties: 2025-10-10T00:32:50.7608442Z Features: KERNEL_DISPATCH 2025-10-10T00:32:50.7608984Z Fast F16 Operation: TRUE 2025-10-10T00:32:50.7609571Z Wavefront Size: 64(0x40) 2025-10-10T00:32:50.7610169Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7610740Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7611200Z x 1024(0x400) 2025-10-10T00:32:50.7612095Z y 1024(0x400) 2025-10-10T00:32:50.7612935Z z 1024(0x400) 2025-10-10T00:32:50.7613566Z Max Waves Per CU: 32(0x20) 2025-10-10T00:32:50.7614263Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:32:50.7614937Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7615459Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7615865Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7616348Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7616833Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7617382Z Max fbarriers/Workgrp: 32 2025-10-10T00:32:50.7618014Z Packet Processor uCode:: 92 2025-10-10T00:32:50.7618626Z SDMA engine uCode:: 9 2025-10-10T00:32:50.7619226Z IOMMU Support:: None 2025-10-10T00:32:50.7619754Z Pool Info: 2025-10-10T00:32:50.7620119Z Pool 1 2025-10-10T00:32:50.7620601Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7621195Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7621755Z Allocatable: TRUE 2025-10-10T00:32:50.7622362Z Alloc Granule: 4KB 2025-10-10T00:32:50.7622991Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7623606Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7624214Z Accessible by all: FALSE 2025-10-10T00:32:50.7624740Z Pool 2 2025-10-10T00:32:50.7625203Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7625790Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7626339Z Allocatable: TRUE 2025-10-10T00:32:50.7626926Z Alloc Granule: 4KB 2025-10-10T00:32:50.7627536Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7628142Z Alloc Alignment: 4KB 
2025-10-10T00:32:50.7628753Z Accessible by all: FALSE 2025-10-10T00:32:50.7629259Z Pool 3 2025-10-10T00:32:50.7629698Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7630235Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7630764Z Allocatable: TRUE 2025-10-10T00:32:50.7631330Z Alloc Granule: 4KB 2025-10-10T00:32:50.7631943Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7632551Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7633143Z Accessible by all: FALSE 2025-10-10T00:32:50.7633646Z Pool 4 2025-10-10T00:32:50.7634059Z Segment: GROUP 2025-10-10T00:32:50.7634877Z Size: 64(0x40) KB 2025-10-10T00:32:50.7635411Z Allocatable: FALSE 2025-10-10T00:32:50.7635978Z Alloc Granule: 0KB 2025-10-10T00:32:50.7636585Z Alloc Recommended Granule:0KB 2025-10-10T00:32:50.7637176Z Alloc Alignment: 0KB 2025-10-10T00:32:50.7637760Z Accessible by all: FALSE 2025-10-10T00:32:50.7638578Z ISA Info: 2025-10-10T00:32:50.7639223Z ISA 1 2025-10-10T00:32:50.7639698Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:32:50.7640443Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:32:50.7641156Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:32:50.7641863Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7642578Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7643243Z Fast f16: TRUE 2025-10-10T00:32:50.7643909Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7644542Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7645095Z x 1024(0x400) 2025-10-10T00:32:50.7645666Z y 1024(0x400) 2025-10-10T00:32:50.7646021Z z 1024(0x400) 2025-10-10T00:32:50.7646310Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7646565Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7646766Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7646984Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7647200Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7647454Z FBarrier Max Size: 32 2025-10-10T00:32:50.7647682Z ******* 2025-10-10T00:32:50.7647839Z Agent 10 2025-10-10T00:32:50.7647994Z ******* 
2025-10-10T00:32:50.7648173Z Name: gfx90a 2025-10-10T00:32:50.7648417Z Uuid: GPU-6d0b1913df2b2636 2025-10-10T00:32:50.7648688Z Marketing Name: AMD Instinct MI250X/MI250 2025-10-10T00:32:50.7648961Z Vendor Name: AMD 2025-10-10T00:32:50.7649217Z Feature: KERNEL_DISPATCH 2025-10-10T00:32:50.7649480Z Profile: BASE_PROFILE 2025-10-10T00:32:50.7649740Z Float Round Mode: NEAR 2025-10-10T00:32:50.7650009Z Max Queue Number: 128(0x80) 2025-10-10T00:32:50.7650275Z Queue Min Size: 64(0x40) 2025-10-10T00:32:50.7650522Z Queue Max Size: 131072(0x20000) 2025-10-10T00:32:50.7650774Z Queue Type: MULTI 2025-10-10T00:32:50.7651006Z Node: 9 2025-10-10T00:32:50.7651248Z Device Type: GPU 2025-10-10T00:32:50.7651479Z Cache Info: 2025-10-10T00:32:50.7651663Z L1: 16(0x10) KB 2025-10-10T00:32:50.7651890Z L2: 8192(0x2000) KB 2025-10-10T00:32:50.7652116Z Chip ID: 29708(0x740c) 2025-10-10T00:32:50.7652362Z ASIC Revision: 1(0x1) 2025-10-10T00:32:50.7652630Z Cacheline Size: 128(0x80) 2025-10-10T00:32:50.7652891Z Max Clock Freq. (MHz): 1700 2025-10-10T00:32:50.7653133Z BDFID: 37632 2025-10-10T00:32:50.7653386Z Internal Node ID: 9 2025-10-10T00:32:50.7653642Z Compute Unit: 104 2025-10-10T00:32:50.7653894Z SIMDs per CU: 4 2025-10-10T00:32:50.7654269Z Shader Engines: 8 2025-10-10T00:32:50.7654631Z Shader Arrs. per Eng.: 1 2025-10-10T00:32:50.7654911Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:32:50.7655185Z Coherent Host Access: FALSE 2025-10-10T00:32:50.7655427Z Memory Properties: 2025-10-10T00:32:50.7655619Z Features: KERNEL_DISPATCH 2025-10-10T00:32:50.7655863Z Fast F16 Operation: TRUE 2025-10-10T00:32:50.7656132Z Wavefront Size: 64(0x40) 2025-10-10T00:32:50.7656399Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7656639Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7656844Z x 1024(0x400) 2025-10-10T00:32:50.7657056Z y 1024(0x400) 2025-10-10T00:32:50.7657273Z z 1024(0x400) 2025-10-10T00:32:50.7657520Z Max Waves Per CU: 32(0x20) 2025-10-10T00:32:50.7657786Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:32:50.7658051Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7658290Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7658474Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7658692Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7658914Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7659175Z Max fbarriers/Workgrp: 32 2025-10-10T00:32:50.7659470Z Packet Processor uCode:: 92 2025-10-10T00:32:50.7659755Z SDMA engine uCode:: 9 2025-10-10T00:32:50.7660043Z IOMMU Support:: None 2025-10-10T00:32:50.7660301Z Pool Info: 2025-10-10T00:32:50.7660484Z Pool 1 2025-10-10T00:32:50.7660710Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:32:50.7660978Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7661237Z Allocatable: TRUE 2025-10-10T00:32:50.7661514Z Alloc Granule: 4KB 2025-10-10T00:32:50.7661799Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7662095Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7662369Z Accessible by all: FALSE 2025-10-10T00:32:50.7662607Z Pool 2 2025-10-10T00:32:50.7662826Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:32:50.7663100Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7663360Z Allocatable: TRUE 2025-10-10T00:32:50.7663628Z Alloc Granule: 4KB 2025-10-10T00:32:50.7663908Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7664196Z Alloc Alignment: 4KB 
2025-10-10T00:32:50.7664480Z Accessible by all: FALSE 2025-10-10T00:32:50.7664714Z Pool 3 2025-10-10T00:32:50.7664920Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:32:50.7665181Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:32:50.7665434Z Allocatable: TRUE 2025-10-10T00:32:50.7665714Z Alloc Granule: 4KB 2025-10-10T00:32:50.7666113Z Alloc Recommended Granule:2048KB 2025-10-10T00:32:50.7666493Z Alloc Alignment: 4KB 2025-10-10T00:32:50.7666773Z Accessible by all: FALSE 2025-10-10T00:32:50.7667006Z Pool 4 2025-10-10T00:32:50.7667212Z Segment: GROUP 2025-10-10T00:32:50.7667454Z Size: 64(0x40) KB 2025-10-10T00:32:50.7667699Z Allocatable: FALSE 2025-10-10T00:32:50.7667966Z Alloc Granule: 0KB 2025-10-10T00:32:50.7668243Z Alloc Recommended Granule:0KB 2025-10-10T00:32:50.7668522Z Alloc Alignment: 0KB 2025-10-10T00:32:50.7668793Z Accessible by all: FALSE 2025-10-10T00:32:50.7669030Z ISA Info: 2025-10-10T00:32:50.7669199Z ISA 1 2025-10-10T00:32:50.7669423Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:32:50.7669714Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:32:50.7669992Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:32:50.7670268Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7670549Z Default Rounding Mode: NEAR 2025-10-10T00:32:50.7670812Z Fast f16: TRUE 2025-10-10T00:32:50.7671072Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:32:50.7671322Z Workgroup Max Size per Dimension: 2025-10-10T00:32:50.7671545Z x 1024(0x400) 2025-10-10T00:32:50.7671764Z y 1024(0x400) 2025-10-10T00:32:50.7671982Z z 1024(0x400) 2025-10-10T00:32:50.7672229Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:32:50.7672466Z Grid Max Size per Dimension: 2025-10-10T00:32:50.7672667Z x 4294967295(0xffffffff) 2025-10-10T00:32:50.7672884Z y 4294967295(0xffffffff) 2025-10-10T00:32:50.7673099Z z 4294967295(0xffffffff) 2025-10-10T00:32:50.7673346Z FBarrier Max Size: 32 2025-10-10T00:32:50.7673573Z *** Done *** 2025-10-10T00:32:50.7690229Z ##[group]Run ngpu=$(rocminfo | grep -c -E 
'Name:.*\sgfx') 2025-10-10T00:32:50.7690566Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-10-10T00:32:50.7691083Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2025-10-10T00:32:50.7691576Z if [[ $ngpu -eq 0 ]]; then 2025-10-10T00:32:50.7691830Z  echo "Error: Failed to detect any GPUs on the runner" 2025-10-10T00:32:50.7692086Z  echo "$msg" 2025-10-10T00:32:50.7692252Z  exit 1 2025-10-10T00:32:50.7692406Z fi 2025-10-10T00:32:50.7714308Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:50.7714561Z env: 2025-10-10T00:32:50.7714707Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.7714876Z ##[endgroup] 2025-10-10T00:32:50.9412780Z ##[group]Run pytorch/pytorch/.github/actions/diskspace-cleanup@main 2025-10-10T00:32:50.9413093Z with: 2025-10-10T00:32:50.9413254Z diskspace-cutoff: 70 2025-10-10T00:32:50.9413412Z env: 2025-10-10T00:32:50.9413574Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.9413746Z ##[endgroup] 2025-10-10T00:32:50.9447942Z ##[group]Run set -ex 2025-10-10T00:32:50.9448148Z set -ex 2025-10-10T00:32:50.9448321Z diskspace_cutoff=70 2025-10-10T00:32:50.9448765Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-10-10T00:32:50.9449200Z if [ ! -d "$docker_root_dir" ]; then 2025-10-10T00:32:50.9449614Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-10-10T00:32:50.9449958Z  exit 0 2025-10-10T00:32:50.9450119Z fi 2025-10-10T00:32:50.9450422Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-10-10T00:32:50.9451036Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified" 2025-10-10T00:32:50.9451571Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-10-10T00:32:50.9451834Z  docker system prune -af 2025-10-10T00:32:50.9452200Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-10-10T00:32:50.9452618Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-10-10T00:32:50.9453049Z  echo "Error: Available diskspace is less than $diskspace_cutoff percent. Not enough diskspace." 2025-10-10T00:32:50.9453415Z  echo "$msg" 2025-10-10T00:32:50.9453601Z  exit 1 2025-10-10T00:32:50.9453764Z  else 2025-10-10T00:32:50.9453965Z  difference=$((diskspace - diskspace_new)) 2025-10-10T00:32:50.9454247Z  echo "Diskspace saved: $difference percent" 2025-10-10T00:32:50.9454483Z  fi 2025-10-10T00:32:50.9454631Z fi 2025-10-10T00:32:50.9475185Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:50.9475454Z env: 2025-10-10T00:32:50.9475604Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:50.9475816Z ##[endgroup] 2025-10-10T00:32:50.9554281Z + diskspace_cutoff=70 2025-10-10T00:32:50.9564938Z ++ docker info -f '{{.DockerRootDir}}' 2025-10-10T00:32:51.0508180Z + docker_root_dir=/media/4TB/docker-rootless 2025-10-10T00:32:51.0508897Z + '[' '!' -d /media/4TB/docker-rootless ']' 2025-10-10T00:32:51.0524591Z ++ df -H --output=pcent /media/4TB/docker-rootless 2025-10-10T00:32:51.0526207Z ++ sed -n 2p 2025-10-10T00:32:51.0531595Z ++ sed s/%// 2025-10-10T00:32:51.0533396Z ++ sed 's/ //' 2025-10-10T00:32:51.0575558Z + diskspace=51 2025-10-10T00:32:51.0576692Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified' 2025-10-10T00:32:51.0577878Z + [[ 51 -ge 70 ]] 2025-10-10T00:32:51.0625231Z ##[group]Run RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-10-10T00:32:51.0626073Z RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-10-10T00:32:51.0626732Z rm -rf "${RUNNER_ARTIFACT_DIR}" 2025-10-10T00:32:51.0627316Z mkdir -p "${RUNNER_ARTIFACT_DIR}" 2025-10-10T00:32:51.0628066Z echo "RUNNER_ARTIFACT_DIR=${RUNNER_ARTIFACT_DIR}" >> "${GITHUB_ENV}" 2025-10-10T00:32:51.0628814Z  2025-10-10T00:32:51.0629342Z RUNNER_TEST_RESULTS_DIR="${RUNNER_TEMP}/test-results" 2025-10-10T00:32:51.0630040Z rm -rf "${RUNNER_TEST_RESULTS_DIR}" 2025-10-10T00:32:51.0630631Z mkdir -p "${RUNNER_TEST_RESULTS_DIR}" 2025-10-10T00:32:51.0631435Z echo "RUNNER_TEST_RESULTS_DIR=${RUNNER_TEST_RESULTS_DIR}" >> "${GITHUB_ENV}" 2025-10-10T00:32:51.0632180Z  2025-10-10T00:32:51.0632589Z RUNNER_DOCS_DIR="${RUNNER_TEMP}/docs" 2025-10-10T00:32:51.0633151Z rm -rf "${RUNNER_DOCS_DIR}" 2025-10-10T00:32:51.0633677Z mkdir -p "${RUNNER_DOCS_DIR}" 2025-10-10T00:32:51.0634511Z echo "RUNNER_DOCS_DIR=${RUNNER_DOCS_DIR}" >> "${GITHUB_ENV}" 2025-10-10T00:32:51.0683310Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:51.0683875Z env: 2025-10-10T00:32:51.0684210Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:51.0684590Z ##[endgroup] 2025-10-10T00:32:51.0998793Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:32:51.0999952Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:32:51.1000727Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:32:51.1044877Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:51.1045458Z env: 2025-10-10T00:32:51.1045792Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:51.1046434Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:51.1047399Z 
RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:51.1048302Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:51.1048893Z ##[endgroup] 2025-10-10T00:32:51.1264497Z ##[group]Run # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-10-10T00:32:51.1265625Z # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-10-10T00:32:51.1266445Z # Add render group for container creation. 2025-10-10T00:32:51.1267095Z render_gid=`cat /etc/group | grep render | cut -d: -f3` 2025-10-10T00:32:51.1267878Z # Ensure GPU isolation if pod is part of kubernetes setup with DEVICE_FLAG. 2025-10-10T00:32:51.1268660Z if [ -f "/etc/podinfo/gha-render-devices" ]; then 2025-10-10T00:32:51.1269317Z  DEVICE_FLAG=$(cat /etc/podinfo/gha-render-devices) 2025-10-10T00:32:51.1269861Z else 2025-10-10T00:32:51.1270249Z  DEVICE_FLAG="--device /dev/dri" 2025-10-10T00:32:51.1270696Z fi 2025-10-10T00:32:51.1271413Z # The --group-add daemon and --group-add bin are needed in the Ubuntu 24.04 and Almalinux OSs respectively. 2025-10-10T00:32:51.1272538Z # This is due to the device files (/dev/kfd & /dev/dri) being owned by video group on bare metal. 2025-10-10T00:32:51.1273572Z # This video group ID maps to subgid 1 inside the docker image due to the /etc/subgid entries. 2025-10-10T00:32:51.1274802Z # The group name corresponding to group ID 1 can change depending on the OS, so both are necessary. 
2025-10-10T00:32:51.1276831Z echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd $DEVICE_FLAG --group-add video --group-add $render_gid --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host" >> "${GITHUB_ENV}" 2025-10-10T00:32:51.1322293Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:51.1322845Z env: 2025-10-10T00:32:51.1323174Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:51.1323813Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:51.1324774Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:51.1325676Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:51.1326301Z ##[endgroup] 2025-10-10T00:32:51.1633849Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-10-10T00:32:51.1634968Z with: 2025-10-10T00:32:51.1635603Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_s3_and_ecr_read_only 2025-10-10T00:32:51.1636524Z aws-region: us-east-1 2025-10-10T00:32:51.1637074Z role-duration-seconds: 18000 2025-10-10T00:32:51.1637654Z audience: sts.amazonaws.com 2025-10-10T00:32:51.1638146Z env: 2025-10-10T00:32:51.1638492Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:51.1639214Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:51.1640290Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:51.1641259Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:51.1643021Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:51.1644946Z ##[endgroup] 2025-10-10T00:32:51.4645815Z Assuming role with OIDC 2025-10-10T00:32:51.6438030Z 
Authenticated as assumedRoleId AROAUPVRELQNLLCOPFEJR:GitHubActions 2025-10-10T00:32:51.7159354Z ##[group]Run aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 2025-10-10T00:32:51.7160301Z with: 2025-10-10T00:32:51.7160740Z mask-password: true 2025-10-10T00:32:51.7161236Z registry-type: private 2025-10-10T00:32:51.7161736Z skip-logout: false 2025-10-10T00:32:51.7162162Z env: 2025-10-10T00:32:51.7162550Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:51.7163337Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:51.7164495Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:51.7165566Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:51.7167419Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:51.7169283Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:51.7169946Z AWS_REGION: us-east-1 2025-10-10T00:32:51.7171065Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:51.7171773Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:51.7181983Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:51.7182452Z ##[endgroup] 2025-10-10T00:32:52.0528437Z Logging into registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:52.3819040Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-10-10T00:32:52.3819790Z with: 2025-10-10T00:32:52.3820905Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:52.3822163Z use-custom-docker-registry: true 2025-10-10T00:32:52.3822651Z docker-build-dir: .ci/docker 2025-10-10T00:32:52.3823104Z docker-build-script: ./build.sh 2025-10-10T00:32:52.3823551Z working-directory: . 
2025-10-10T00:32:52.3824088Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:52.3824689Z force-push: false 2025-10-10T00:32:52.3825041Z env: 2025-10-10T00:32:52.3825369Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:52.3826015Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:52.3826982Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:52.3827945Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:52.3829451Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:52.3830808Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:52.3831245Z AWS_REGION: us-east-1 2025-10-10T00:32:52.3831794Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:52.3832368Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:52.3841307Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:52.3841795Z ##[endgroup] 2025-10-10T00:32:52.3877410Z ##[group]Run set -ex 2025-10-10T00:32:52.3877868Z set -ex 2025-10-10T00:32:52.3878230Z  2025-10-10T00:32:52.3878834Z # If the docker build directory or the build script doesn't exist, the action will 2025-10-10T00:32:52.3879863Z # gracefully return the docker image name as it is. 
Pulling docker image in Linux 2025-10-10T00:32:52.3880733Z # job could then download the pre-built image as usual 2025-10-10T00:32:52.3881783Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-10-10T00:32:52.3882739Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.3883601Z else 2025-10-10T00:32:52.3884004Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.3884678Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.3885293Z  2025-10-10T00:32:52.3886126Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 2025-10-10T00:32:52.3887082Z  exit 0 2025-10-10T00:32:52.3887418Z fi 2025-10-10T00:32:52.3887749Z  2025-10-10T00:32:52.3888274Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-10-10T00:32:52.3889182Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-10-10T00:32:52.3889981Z  # use it as it is, but first let's extract the tag 2025-10-10T00:32:52.3890704Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-10-10T00:32:52.3891489Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.3892220Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.3892819Z else 2025-10-10T00:32:52.3893222Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-10-10T00:32:52.3893793Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-10-10T00:32:52.3894432Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-10-10T00:32:52.3894954Z  fi 2025-10-10T00:32:52.3896023Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-10-10T00:32:52.3896947Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.3897902Z  echo 
"docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.3898945Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.3899589Z fi 2025-10-10T00:32:52.3946012Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:52.3946608Z env: 2025-10-10T00:32:52.3946961Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:52.3947614Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:52.3948616Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:52.3949532Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:52.3951047Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:52.3952386Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:52.3952832Z AWS_REGION: us-east-1 2025-10-10T00:32:52.3953329Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:52.3953914Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:52.3962512Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:52.3962913Z REPO_NAME: pytorch 2025-10-10T00:32:52.3964014Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:52.3965242Z DOCKER_BUILD_DIR: .ci/docker 2025-10-10T00:32:52.3965705Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-10-10T00:32:52.3966305Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:52.3966937Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-10-10T00:32:52.3967395Z CUSTOM_TAG_PREFIX: 2025-10-10T00:32:52.3967779Z ##[endgroup] 2025-10-10T00:32:52.4056115Z + [[ -d .ci/docker ]] 2025-10-10T00:32:52.4056792Z + [[ -f .ci/docker/./build.sh ]] 2025-10-10T00:32:52.4057386Z + [[ true == \t\r\u\e ]] 
2025-10-10T00:32:52.4057867Z + echo skip=false 2025-10-10T00:32:52.4060657Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-10-10T00:32:52.4078362Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:52.4079863Z ++ awk -F '[:,]' '{print $2}' 2025-10-10T00:32:52.4139529Z + DOCKER_TAG=pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:52.4140979Z + echo docker-tag=pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:52.4142946Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:52.4200235Z ##[group]Run set +e 2025-10-10T00:32:52.4200724Z set +e 2025-10-10T00:32:52.4201080Z set -x 2025-10-10T00:32:52.4201421Z  2025-10-10T00:32:52.4201790Z login() { 2025-10-10T00:32:52.4202529Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-10-10T00:32:52.4203330Z } 2025-10-10T00:32:52.4203652Z  2025-10-10T00:32:52.4203979Z retry () { 2025-10-10T00:32:52.4204392Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-10-10T00:32:52.4204870Z } 2025-10-10T00:32:52.4205194Z  2025-10-10T00:32:52.4205552Z retry login "${DOCKER_REGISTRY}" 2025-10-10T00:32:52.4206018Z  2025-10-10T00:32:52.4206354Z START_TIME=$(date +%s) 2025-10-10T00:32:52.4207261Z # Wait up to 120 minutes 2025-10-10T00:32:52.4207833Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-10-10T00:32:52.4208578Z  # Check if image already exists, if it does then skip building it 2025-10-10T00:32:52.4209324Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 
2025-10-10T00:32:52.4209885Z  exit 0 2025-10-10T00:32:52.4210243Z  fi 2025-10-10T00:32:52.4210574Z  2025-10-10T00:32:52.4211177Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-10-10T00:32:52.4212358Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-10-10T00:32:52.4213412Z  # latter, it will wait for the Docker images to become available before continuing 2025-10-10T00:32:52.4214204Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-10-10T00:32:52.4214821Z  # It's a Docker build job, let's build the image 2025-10-10T00:32:52.4215367Z  break 2025-10-10T00:32:52.4215731Z  else 2025-10-10T00:32:52.4216255Z  # It's a regular build job, wait for the image to become available 2025-10-10T00:32:52.4216885Z  sleep 300 2025-10-10T00:32:52.4217262Z  fi 2025-10-10T00:32:52.4217595Z done 2025-10-10T00:32:52.4217945Z  2025-10-10T00:32:52.4218482Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-10-10T00:32:52.4219328Z # be empty. The default action would be to continue rebuild the image 2025-10-10T00:32:52.4220095Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-10-10T00:32:52.4220768Z  # if we're on the base branch then use the parent commit 2025-10-10T00:32:52.4221376Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-10-10T00:32:52.4221847Z else 2025-10-10T00:32:52.4222339Z  # otherwise we're on a PR, so use the most recent base commit 2025-10-10T00:32:52.4223038Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-10-10T00:32:52.4223574Z fi 2025-10-10T00:32:52.4223894Z  2025-10-10T00:32:52.4224249Z if [[ -z "${MERGE_BASE}" ]]; then 2025-10-10T00:32:52.4224784Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.4225614Z  2025-10-10T00:32:52.4226300Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 
2025-10-10T00:32:52.4227107Z  exit 0 2025-10-10T00:32:52.4227451Z fi 2025-10-10T00:32:52.4227771Z  2025-10-10T00:32:52.4228238Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-10-10T00:32:52.4229258Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-10-10T00:32:52.4230128Z  exit 1 2025-10-10T00:32:52.4230465Z fi 2025-10-10T00:32:52.4230803Z  2025-10-10T00:32:52.4231356Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-10-10T00:32:52.4232348Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-10-10T00:32:52.4233235Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-10-10T00:32:52.4234554Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-10-10T00:32:52.4235748Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-10-10T00:32:52.4236546Z fi 2025-10-10T00:32:52.4236924Z  2025-10-10T00:32:52.4237396Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-10-10T00:32:52.4286269Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:52.4286881Z env: 2025-10-10T00:32:52.4287248Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:52.4288260Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:52.4289283Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:52.4290214Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:52.4291824Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:52.4293374Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:52.4293827Z AWS_REGION: us-east-1 
2025-10-10T00:32:52.4294389Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:52.4295002Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:52.4303473Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:52.4303898Z DOCKER_BUILD_DIR: .ci/docker 2025-10-10T00:32:52.4304419Z BASE_REVISION: 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:32:52.4305717Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:52.4307219Z DOCKER_TAG: pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:52.4308153Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:52.4308766Z DOCKER_PUSH: 2025-10-10T00:32:52.4309125Z ##[endgroup] 2025-10-10T00:32:52.4396823Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:52.4397623Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:52.4405389Z + aws ecr get-login-password --region us-east-1 2025-10-10T00:32:52.4408997Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:53.5486312Z WARNING! Your password will be stored unencrypted in /var/home/pytorchci/.docker/config.json. 2025-10-10T00:32:53.5487453Z Configure a credential helper to remove this warning. 
See 2025-10-10T00:32:53.5488565Z https://docs.docker.com/engine/reference/commandline/login/#credential-stores 2025-10-10T00:32:53.5489246Z 2025-10-10T00:32:53.5490039Z Login Succeeded 2025-10-10T00:32:53.5552276Z ++ date +%s 2025-10-10T00:32:53.5580426Z + START_TIME=1760056373 2025-10-10T00:32:53.5591025Z ++ date +%s 2025-10-10T00:32:53.5620233Z + [[ 1760049173 -lt 1760056373 ]] 2025-10-10T00:32:53.5622360Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:54.2371405Z { 2025-10-10T00:32:54.2371872Z "schemaVersion": 2, 2025-10-10T00:32:54.2372712Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-10-10T00:32:54.2373507Z "config": { 2025-10-10T00:32:54.2374228Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-10-10T00:32:54.2375092Z "size": 30258, 2025-10-10T00:32:54.2375966Z "digest": "sha256:89b57b826b3a69d70fb9aeb0f8c7d912e03c592d8e1e3f8a1b63f846423ddeef" 2025-10-10T00:32:54.2376797Z }, 2025-10-10T00:32:54.2377151Z "layers": [ 2025-10-10T00:32:54.2377515Z { 2025-10-10T00:32:54.2378117Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2378835Z "size": 30447990, 2025-10-10T00:32:54.2379543Z "digest": "sha256:828c1365039a657352c737a62d13e1932951b5658eb6bd9b9096ea9b73562453" 2025-10-10T00:32:54.2380340Z }, 2025-10-10T00:32:54.2380684Z { 2025-10-10T00:32:54.2381281Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2381989Z "size": 1552, 2025-10-10T00:32:54.2382698Z "digest": "sha256:7ac174ee79e283642648ca85bc1c5a9d145ee0eb3e7fd908cdaca6bd695c4d8b" 2025-10-10T00:32:54.2383492Z }, 2025-10-10T00:32:54.2383829Z { 2025-10-10T00:32:54.2384378Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2385072Z "size": 313650163, 2025-10-10T00:32:54.2385817Z "digest": 
"sha256:c5c3a7bcee471e68bf4c83148ee42d129ab0bced8c0d8146b0f243344bc2bcab" 2025-10-10T00:32:54.2387476Z }, 2025-10-10T00:32:54.2387835Z { 2025-10-10T00:32:54.2388395Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2389095Z "size": 702, 2025-10-10T00:32:54.2389783Z "digest": "sha256:3c2ece95710ec23db1db58d451eed7e74a03738f016217434a25a792d7ca89d5" 2025-10-10T00:32:54.2390569Z }, 2025-10-10T00:32:54.2390895Z { 2025-10-10T00:32:54.2391441Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2392123Z "size": 1216, 2025-10-10T00:32:54.2392802Z "digest": "sha256:6f806b94527710d95b9f4ff9519bb49ba6584b0d061c4875b725b1c00d4b5d73" 2025-10-10T00:32:54.2393707Z }, 2025-10-10T00:32:54.2394019Z { 2025-10-10T00:32:54.2394764Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2395466Z "size": 485, 2025-10-10T00:32:54.2396159Z "digest": "sha256:c85f6a6fe8c556bb8dad8012438ff20c006a4eb7d62e5d8d9fcbf895a94fc393" 2025-10-10T00:32:54.2396946Z }, 2025-10-10T00:32:54.2397287Z { 2025-10-10T00:32:54.2397818Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2398514Z "size": 110343846, 2025-10-10T00:32:54.2399226Z "digest": "sha256:e31af66554c7c4eb5ec5ebc6a8a0f60695de630c39ec4d9e2951e1539033c8f2" 2025-10-10T00:32:54.2400007Z }, 2025-10-10T00:32:54.2400330Z { 2025-10-10T00:32:54.2400874Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2401564Z "size": 4277, 2025-10-10T00:32:54.2402263Z "digest": "sha256:8aa37c38082d4e5576fb6ecedb8a4add36fe4809bb743b901dbe8a4c3f47134f" 2025-10-10T00:32:54.2403051Z }, 2025-10-10T00:32:54.2403374Z { 2025-10-10T00:32:54.2403907Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2404589Z "size": 1708, 2025-10-10T00:32:54.2405277Z "digest": "sha256:46b2487aad4b7fa6e5c4a2eebbd4630a74edba769496a0919a1e10568bbf80ed" 2025-10-10T00:32:54.2406069Z }, 
2025-10-10T00:32:54.2406390Z { 2025-10-10T00:32:54.2406926Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2407607Z "size": 724, 2025-10-10T00:32:54.2408292Z "digest": "sha256:9605bad93e1d03d073e200df762b1bad4abb131eb14a5f7774cdb52b774c4d16" 2025-10-10T00:32:54.2409076Z }, 2025-10-10T00:32:54.2409402Z { 2025-10-10T00:32:54.2409934Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2410982Z "size": 3252894550, 2025-10-10T00:32:54.2411709Z "digest": "sha256:87552a77a639921abb9c5552585088ced7f39e46307e0315e9473dccc556a150" 2025-10-10T00:32:54.2412503Z }, 2025-10-10T00:32:54.2412851Z { 2025-10-10T00:32:54.2413418Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2414135Z "size": 380, 2025-10-10T00:32:54.2414846Z "digest": "sha256:a2156a35ccd3657d2627d9a8774ae2b66dedee5b9b1a8adb8a8a4716fdb9fa5c" 2025-10-10T00:32:54.2415627Z }, 2025-10-10T00:32:54.2415962Z { 2025-10-10T00:32:54.2416527Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2417229Z "size": 236059, 2025-10-10T00:32:54.2417933Z "digest": "sha256:b61ff137c8e0eda8a04d533ab71af86865882de2948f440b43fd3a39b8a356ed" 2025-10-10T00:32:54.2418715Z }, 2025-10-10T00:32:54.2419048Z { 2025-10-10T00:32:54.2419593Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2420303Z "size": 789, 2025-10-10T00:32:54.2420995Z "digest": "sha256:732b27f0ef2992ce183a289dd197b5ade0f423935e0b17cdafbac6c70deb27ea" 2025-10-10T00:32:54.2421785Z }, 2025-10-10T00:32:54.2422122Z { 2025-10-10T00:32:54.2422662Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2423357Z "size": 106, 2025-10-10T00:32:54.2424053Z "digest": "sha256:cc46d4ec636e9d735b857a2969cc4def1bb8022a4ab25a1f8f41a49a02bde5e3" 2025-10-10T00:32:54.2424856Z }, 2025-10-10T00:32:54.2425209Z { 2025-10-10T00:32:54.2425774Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2426825Z "size": 1495, 2025-10-10T00:32:54.2427587Z "digest": "sha256:adee52a759d60902c3a52ea8d07ed588fa1d4ff2387c78ec0dd0cc86c3c5855d" 2025-10-10T00:32:54.2428435Z }, 2025-10-10T00:32:54.2428841Z { 2025-10-10T00:32:54.2429499Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2430338Z "size": 453791726, 2025-10-10T00:32:54.2431091Z "digest": "sha256:64fe7c7cdfbb99a5abd63009e8f5a07fb37e7e63487a0379b1bea968c4f7e5dc" 2025-10-10T00:32:54.2431900Z }, 2025-10-10T00:32:54.2432245Z { 2025-10-10T00:32:54.2432799Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2433504Z "size": 163, 2025-10-10T00:32:54.2434315Z "digest": "sha256:ee78e0f72b875648b90996a9b0e31597efd23a2894b65f5a35e8b058867b3063" 2025-10-10T00:32:54.2435117Z }, 2025-10-10T00:32:54.2435453Z { 2025-10-10T00:32:54.2435997Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2436686Z "size": 104, 2025-10-10T00:32:54.2437402Z "digest": "sha256:1e905ae1f558dce89f972e0c93dfccf6cfe97d1bd8bdfe527c58919779c61f69" 2025-10-10T00:32:54.2438211Z }, 2025-10-10T00:32:54.2438550Z { 2025-10-10T00:32:54.2439089Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2439778Z "size": 724, 2025-10-10T00:32:54.2440473Z "digest": "sha256:9605bad93e1d03d073e200df762b1bad4abb131eb14a5f7774cdb52b774c4d16" 2025-10-10T00:32:54.2441261Z }, 2025-10-10T00:32:54.2441597Z { 2025-10-10T00:32:54.2442136Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2442832Z "size": 197, 2025-10-10T00:32:54.2443508Z "digest": "sha256:3516c5b05d6df7a6c03620ac59ad0e5b7fedff719299300fcd2e8914a3dad65d" 2025-10-10T00:32:54.2444285Z }, 2025-10-10T00:32:54.2444615Z { 2025-10-10T00:32:54.2445162Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2445898Z "size": 2579, 
2025-10-10T00:32:54.2446704Z "digest": "sha256:1c76a8222270254d9be9a8460d786014446ea5524724e398fca7f523cf45ad10" 2025-10-10T00:32:54.2447551Z }, 2025-10-10T00:32:54.2447882Z { 2025-10-10T00:32:54.2448426Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2449119Z "size": 8207444317, 2025-10-10T00:32:54.2449816Z "digest": "sha256:f1887b9d2182358809b56a5c58d2e5fa2e1879c10e7ef9142bb78012485ae2f2" 2025-10-10T00:32:54.2451041Z }, 2025-10-10T00:32:54.2451368Z { 2025-10-10T00:32:54.2451904Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2452582Z "size": 135, 2025-10-10T00:32:54.2453498Z "digest": "sha256:dde85f9a277441f648277a83ea740623d54bea123f75b0e2cbafd57f838ddb37" 2025-10-10T00:32:54.2454483Z }, 2025-10-10T00:32:54.2454823Z { 2025-10-10T00:32:54.2455371Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2456106Z "size": 104, 2025-10-10T00:32:54.2456787Z "digest": "sha256:6bd3e31cfe9285924106cc395ecb2424f4e6010889910b4c6796cff56ed188bb" 2025-10-10T00:32:54.2457573Z }, 2025-10-10T00:32:54.2457903Z { 2025-10-10T00:32:54.2458443Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2459137Z "size": 613, 2025-10-10T00:32:54.2459828Z "digest": "sha256:60f03ebbba5f7c793e69b964f61ec13a19366f1e8af2f3672c4c2f83afe99803" 2025-10-10T00:32:54.2460631Z }, 2025-10-10T00:32:54.2460952Z { 2025-10-10T00:32:54.2461494Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2462190Z "size": 839542112, 2025-10-10T00:32:54.2462917Z "digest": "sha256:8ea747dcc0015c14149dfcd78ff4a959b153addd69aec77524d1b025f18df43e" 2025-10-10T00:32:54.2463705Z }, 2025-10-10T00:32:54.2464041Z { 2025-10-10T00:32:54.2464588Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2465274Z "size": 111, 2025-10-10T00:32:54.2465961Z "digest": 
"sha256:e068c2648c78ee7572fdcc7382c4d559d58b7b09ef7b2cb96f946e62921cd8c5" 2025-10-10T00:32:54.2467075Z }, 2025-10-10T00:32:54.2467417Z { 2025-10-10T00:32:54.2467970Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2468659Z "size": 1556, 2025-10-10T00:32:54.2469366Z "digest": "sha256:bb6360b8fc5c5e1c9e6ea2b6dc4c126dfac9d0077eb838fa8a6199e27b5a2e53" 2025-10-10T00:32:54.2470175Z }, 2025-10-10T00:32:54.2470502Z { 2025-10-10T00:32:54.2471041Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2471729Z "size": 107, 2025-10-10T00:32:54.2472405Z "digest": "sha256:6c5b495f2f53b98f68012497b19b376d5c2c40de273d842cfc67dcc3b67c79ca" 2025-10-10T00:32:54.2473183Z }, 2025-10-10T00:32:54.2473521Z { 2025-10-10T00:32:54.2474069Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2475139Z "size": 166, 2025-10-10T00:32:54.2475840Z "digest": "sha256:1d4404006668a675fa8e53f65ea4cf6ecd010ed802a89de58303f1c59635f786" 2025-10-10T00:32:54.2476642Z }, 2025-10-10T00:32:54.2476996Z { 2025-10-10T00:32:54.2477567Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2478296Z "size": 2940926, 2025-10-10T00:32:54.2479174Z "digest": "sha256:bc1f58ea6fa19dd5fe5fc202dee7b6c0ed814e0c8ad79dda3aaa0fd7a70e5e97" 2025-10-10T00:32:54.2480143Z }, 2025-10-10T00:32:54.2480535Z { 2025-10-10T00:32:54.2481117Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2481812Z "size": 107, 2025-10-10T00:32:54.2482490Z "digest": "sha256:d182b67595e1673c4b0240991d100a9f777c9804f07feeca9a6cdc3d04e00ca5" 2025-10-10T00:32:54.2483257Z }, 2025-10-10T00:32:54.2483590Z { 2025-10-10T00:32:54.2484132Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2484819Z "size": 827, 2025-10-10T00:32:54.2485489Z "digest": "sha256:93b9a85ec6046f661c48c0d95d99a71dc281b5fc7f18bc187880ff0c06fb6fc3" 2025-10-10T00:32:54.2486271Z }, 
2025-10-10T00:32:54.2486603Z { 2025-10-10T00:32:54.2487154Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2487851Z "size": 26818595, 2025-10-10T00:32:54.2488555Z "digest": "sha256:39f425429d59597cce7e54c8bb6447d8fd2e2fc84c5074a547debcf87690fbb9" 2025-10-10T00:32:54.2489331Z }, 2025-10-10T00:32:54.2489662Z { 2025-10-10T00:32:54.2490209Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2491261Z "size": 104, 2025-10-10T00:32:54.2491941Z "digest": "sha256:73e22158c2fee9960acbc2921395d33ce9eab889f278bfc0019b5b3b5238226d" 2025-10-10T00:32:54.2492726Z }, 2025-10-10T00:32:54.2493057Z { 2025-10-10T00:32:54.2493603Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2494291Z "size": 425, 2025-10-10T00:32:54.2494992Z "digest": "sha256:bb9f49ea9c5d4f01913fe86aa1f9556adb5692d2ccddddae68ba8de4120def5a" 2025-10-10T00:32:54.2495814Z }, 2025-10-10T00:32:54.2496206Z { 2025-10-10T00:32:54.2496875Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2497667Z "size": 19309401, 2025-10-10T00:32:54.2498376Z "digest": "sha256:080fb704e09a99895724a289bd3a71f93dd01622f28eab958dedb9f9e6d39498" 2025-10-10T00:32:54.2499158Z }, 2025-10-10T00:32:54.2499495Z { 2025-10-10T00:32:54.2500046Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2500757Z "size": 639, 2025-10-10T00:32:54.2501460Z "digest": "sha256:ae187f30d9ff6ecf10bb335124a1a5ebb7d9711f9a74da5ad47845fbeb0ddff3" 2025-10-10T00:32:54.2502258Z }, 2025-10-10T00:32:54.2502596Z { 2025-10-10T00:32:54.2503142Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2503824Z "size": 724, 2025-10-10T00:32:54.2504519Z "digest": "sha256:9605bad93e1d03d073e200df762b1bad4abb131eb14a5f7774cdb52b774c4d16" 2025-10-10T00:32:54.2505322Z }, 2025-10-10T00:32:54.2505659Z { 2025-10-10T00:32:54.2506203Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2507221Z "size": 148, 2025-10-10T00:32:54.2507936Z "digest": "sha256:46a5515da8110a220af5c46fee40ac3c2ddde17afda7777e08e4a447768ae44c" 2025-10-10T00:32:54.2508727Z }, 2025-10-10T00:32:54.2509060Z { 2025-10-10T00:32:54.2509603Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2510287Z "size": 134, 2025-10-10T00:32:54.2510983Z "digest": "sha256:9e3e429b4137b006bb22d611c37ad504eae684eb1a5e2b177c24b6752f026b0c" 2025-10-10T00:32:54.2511783Z }, 2025-10-10T00:32:54.2512122Z { 2025-10-10T00:32:54.2512657Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2513351Z "size": 140, 2025-10-10T00:32:54.2514018Z "digest": "sha256:b6d5c62c0145cccf86b579889d6782401a405a0a1726a6067c1fad2366883422" 2025-10-10T00:32:54.2514930Z }, 2025-10-10T00:32:54.2515284Z { 2025-10-10T00:32:54.2515824Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2516514Z "size": 32, 2025-10-10T00:32:54.2517247Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:32:54.2518045Z }, 2025-10-10T00:32:54.2518375Z { 2025-10-10T00:32:54.2519011Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2519821Z "size": 223, 2025-10-10T00:32:54.2520643Z "digest": "sha256:6f5d3ca8a555b4cb15b8d96c2fe53ca7f542f793711e86a1672c657fcc4cfddf" 2025-10-10T00:32:54.2521454Z }, 2025-10-10T00:32:54.2521791Z { 2025-10-10T00:32:54.2522338Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2523035Z "size": 346, 2025-10-10T00:32:54.2523724Z "digest": "sha256:e53c3d3088da1bda7fc1354581e1768d5e91e7a5d82d5a52d09c01983b27232e" 2025-10-10T00:32:54.2524524Z }, 2025-10-10T00:32:54.2524854Z { 2025-10-10T00:32:54.2525408Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2526104Z "size": 88298, 
2025-10-10T00:32:54.2526815Z "digest": "sha256:3299f6ba46346a593a24b7edfbaf68331d9d71726b1aba05609ad297da6db570" 2025-10-10T00:32:54.2527595Z }, 2025-10-10T00:32:54.2527915Z { 2025-10-10T00:32:54.2528451Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2529141Z "size": 106, 2025-10-10T00:32:54.2529810Z "digest": "sha256:b8903b144cdc5ee042c9762972511e716dcb33c2c5033eb9c19d1a4924b3403e" 2025-10-10T00:32:54.2530993Z }, 2025-10-10T00:32:54.2531323Z { 2025-10-10T00:32:54.2531864Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2532558Z "size": 1669, 2025-10-10T00:32:54.2533248Z "digest": "sha256:d581d2e596973e398e3de3d9f2f66dbf8ee6d54e906bdc46630a2b9e168e42de" 2025-10-10T00:32:54.2534034Z }, 2025-10-10T00:32:54.2534362Z { 2025-10-10T00:32:54.2534926Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2535619Z "size": 724, 2025-10-10T00:32:54.2536414Z "digest": "sha256:9605bad93e1d03d073e200df762b1bad4abb131eb14a5f7774cdb52b774c4d16" 2025-10-10T00:32:54.2537355Z }, 2025-10-10T00:32:54.2537699Z { 2025-10-10T00:32:54.2538247Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2538928Z "size": 138, 2025-10-10T00:32:54.2539609Z "digest": "sha256:d27558c9ffd622a5e3c4330342cb75239e0065d1dc05d0eac1a63b8f87996261" 2025-10-10T00:32:54.2540398Z }, 2025-10-10T00:32:54.2540722Z { 2025-10-10T00:32:54.2541259Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2541945Z "size": 120, 2025-10-10T00:32:54.2542619Z "digest": "sha256:673657e3d3eeffae943df424398f33b2fc63135d6165768c9a0f53b9f80e8a6e" 2025-10-10T00:32:54.2543401Z }, 2025-10-10T00:32:54.2543745Z { 2025-10-10T00:32:54.2544293Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2544993Z "size": 6199537580, 2025-10-10T00:32:54.2545730Z "digest": "sha256:d2edde2c6ef8c12c8f4709e9bb4febe7c1a467c950bd06c62c66d70be3f3c168" 
2025-10-10T00:32:54.2549546Z }, 2025-10-10T00:32:54.2549926Z { 2025-10-10T00:32:54.2550498Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2551210Z "size": 176, 2025-10-10T00:32:54.2551915Z "digest": "sha256:df972c8e23408a36bb4519a2fdf5154099ed6acfb393cd9b46f5e3ac8fad94ab" 2025-10-10T00:32:54.2552724Z }, 2025-10-10T00:32:54.2553059Z { 2025-10-10T00:32:54.2553602Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2554417Z "size": 1897, 2025-10-10T00:32:54.2555129Z "digest": "sha256:a3a777f66caeecc2e8a5239f6f140e9bd55f8e97ea1cdba4fe46f60989a6b4da" 2025-10-10T00:32:54.2555937Z }, 2025-10-10T00:32:54.2556264Z { 2025-10-10T00:32:54.2556805Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2557495Z "size": 197573012, 2025-10-10T00:32:54.2558206Z "digest": "sha256:e152d75ccb9d31c4df936c02415d6dd05c79e8d8f916202be192ea6ffc03872b" 2025-10-10T00:32:54.2558983Z }, 2025-10-10T00:32:54.2559313Z { 2025-10-10T00:32:54.2559849Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2560536Z "size": 302, 2025-10-10T00:32:54.2561212Z "digest": "sha256:86c51c6ffd313c2f7259e9c2aea6bd24da432cdff1420bb30861d5f907f6bfe2" 2025-10-10T00:32:54.2561995Z }, 2025-10-10T00:32:54.2562330Z { 2025-10-10T00:32:54.2562868Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2563547Z "size": 32, 2025-10-10T00:32:54.2564224Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:32:54.2564997Z }, 2025-10-10T00:32:54.2565307Z { 2025-10-10T00:32:54.2565843Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2566529Z "size": 108, 2025-10-10T00:32:54.2567185Z "digest": "sha256:2bfd48767f2c2871079de3142c343766230e092f7064bda374c2cf9cc134f6b2" 2025-10-10T00:32:54.2567955Z }, 2025-10-10T00:32:54.2568275Z { 2025-10-10T00:32:54.2568818Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:32:54.2569508Z "size": 54145665, 2025-10-10T00:32:54.2570208Z "digest": "sha256:b958050798e4b6b9ad163dbd892ba56a2eda087652fd168bb432dcb9a2606897" 2025-10-10T00:32:54.2570980Z } 2025-10-10T00:32:54.2571301Z ] 2025-10-10T00:32:54.2571626Z } 2025-10-10T00:32:54.2571975Z + exit 0 2025-10-10T00:32:54.2621084Z ##[group]Run set -eux 2025-10-10T00:32:54.2621527Z set -eux 2025-10-10T00:32:54.2622155Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-10-10T00:32:54.2623885Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-10-10T00:32:54.2674240Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:54.2674846Z env: 2025-10-10T00:32:54.2675203Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:54.2675888Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:54.2676893Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:54.2677806Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:54.2679331Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:54.2680693Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:54.2681169Z AWS_REGION: us-east-1 2025-10-10T00:32:54.2681734Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:54.2682309Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:54.2690860Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:54.2691263Z ##[endgroup] 2025-10-10T00:32:54.2786076Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-10-10T00:32:54.2790042Z + jq --raw-output .SecretString 
2025-10-10T00:32:54.2792476Z + jq -r .docker_hub_readonly_token 2025-10-10T00:32:54.2796934Z + docker login --username pytorchbot --password-stdin 2025-10-10T00:32:54.9954783Z 2025-10-10T00:32:54.9957728Z An error occurred (AccessDeniedException) when calling the GetSecretValue operation: User: arn:aws:sts::308535385114:assumed-role/gha_workflow_s3_and_ecr_read_only/GitHubActions is not authorized to perform: secretsmanager:GetSecretValue on resource: docker_hub_readonly_token because no identity-based policy allows the secretsmanager:GetSecretValue action 2025-10-10T00:32:55.0930494Z Error: Cannot perform an interactive login from a non TTY device 2025-10-10T00:32:55.0987309Z + true 2025-10-10T00:32:55.1153753Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-10-10T00:32:55.1154736Z with: 2025-10-10T00:32:55.1155975Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:55.1157514Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:55.1158200Z env: 2025-10-10T00:32:55.1158594Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:55.1159358Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:55.1160457Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:55.1161512Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:55.1163287Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:55.1164825Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:55.1165332Z AWS_REGION: us-east-1 2025-10-10T00:32:55.1165927Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:55.1166600Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:55.1176288Z 
AWS_SESSION_TOKEN: *** 2025-10-10T00:32:55.1176762Z ##[endgroup] 2025-10-10T00:32:55.1206722Z ##[group]Run set -x 2025-10-10T00:32:55.1207159Z set -x 2025-10-10T00:32:55.1207516Z set +e 2025-10-10T00:32:55.1207862Z  2025-10-10T00:32:55.1208189Z login() { 2025-10-10T00:32:55.1208947Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-10-10T00:32:55.1210104Z } 2025-10-10T00:32:55.1210432Z  2025-10-10T00:32:55.1210760Z retry () { 2025-10-10T00:32:55.1211181Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-10-10T00:32:55.1211663Z } 2025-10-10T00:32:55.1211979Z  2025-10-10T00:32:55.1212352Z retry login "${DOCKER_REGISTRY}" 2025-10-10T00:32:55.1212819Z  2025-10-10T00:32:55.1213559Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-10-10T00:32:55.1214541Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-10-10T00:32:55.1215102Z  2025-10-10T00:32:55.1215422Z set -e 2025-10-10T00:32:55.1215957Z # ignore output since only exit code is used for conditional 2025-10-10T00:32:55.1216698Z # only pull docker image if it's not available locally 2025-10-10T00:32:55.1217524Z if ! 
docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-10-10T00:32:55.1218282Z  retry docker pull "${DOCKER_IMAGE}" 2025-10-10T00:32:55.1218790Z fi 2025-10-10T00:32:55.1263344Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:55.1263935Z env: 2025-10-10T00:32:55.1264305Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:55.1264972Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:55.1265958Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:55.1266867Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:55.1268366Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:55.1269722Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:55.1270180Z AWS_REGION: us-east-1 2025-10-10T00:32:55.1270673Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:55.1271248Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:55.1279942Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:55.1281484Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:55.1282935Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:55.1283642Z ##[endgroup] 2025-10-10T00:32:55.1374033Z + set +e 2025-10-10T00:32:55.1374673Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:55.1375515Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:55.1384654Z + aws ecr get-login-password --region us-east-1 2025-10-10T00:32:55.1387544Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:32:56.2410795Z WARNING! 
Your password will be stored unencrypted in /var/home/pytorchci/.docker/config.json. 2025-10-10T00:32:56.2412000Z Configure a credential helper to remove this warning. See 2025-10-10T00:32:56.2413047Z https://docs.docker.com/engine/reference/commandline/login/#credential-stores 2025-10-10T00:32:56.2413752Z 2025-10-10T00:32:56.2415019Z Login Succeeded 2025-10-10T00:32:56.2485930Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:56.2487910Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-10-10T00:32:56.9819779Z + IMAGE_SIZE=18795.79888534546 2025-10-10T00:32:56.9820512Z + echo 'Compressed size of image in MB: 18795.79888534546' 2025-10-10T00:32:56.9821234Z + set -e 2025-10-10T00:32:56.9822587Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:32:56.9824143Z Compressed size of image in MB: 18795.79888534546 2025-10-10T00:32:57.0300368Z Prepare all required actions 2025-10-10T00:32:57.0365894Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-10-10T00:32:57.0366528Z with: 2025-10-10T00:32:57.0367360Z github-token: *** 2025-10-10T00:32:57.0367797Z env: 2025-10-10T00:32:57.0368199Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:57.0369115Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:57.0370357Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:57.0371401Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:57.0373137Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:57.0374683Z AWS_DEFAULT_REGION: 
us-east-1 2025-10-10T00:32:57.0375180Z AWS_REGION: us-east-1 2025-10-10T00:32:57.0375854Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:57.0376532Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:57.0386158Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:57.0386611Z ##[endgroup] 2025-10-10T00:32:57.0418739Z ##[group]Run set -eux 2025-10-10T00:32:57.0419218Z set -eux 2025-10-10T00:32:57.0419981Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-10-10T00:32:57.0472897Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:32:57.0473570Z env: 2025-10-10T00:32:57.0473965Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:57.0475082Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:57.0476364Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:57.0477598Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:57.0479372Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:57.0480930Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:57.0481429Z AWS_REGION: us-east-1 2025-10-10T00:32:57.0481989Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:57.0482853Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:57.0492853Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:57.0493528Z GITHUB_TOKEN: *** 2025-10-10T00:32:57.0493940Z ##[endgroup] 2025-10-10T00:32:57.0588894Z + python3 .github/scripts/get_workflow_job_id.py 18392306192 gpud501 2025-10-10T00:32:57.6209935Z Setting output job-id=52406492265 2025-10-10T00:32:57.6210933Z Setting output job-name=linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T00:32:57.6468705Z Prepare all required actions 2025-10-10T00:32:57.6469408Z Getting action download info 2025-10-10T00:32:57.8333228Z Download action 
repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-10-10T00:32:58.3261614Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-10-10T00:32:58.8483538Z ##[group]Run ./.github/actions/download-build-artifacts 2025-10-10T00:32:58.8483884Z with: 2025-10-10T00:32:58.8484114Z name: linux-jammy-rocm-py3.10 2025-10-10T00:32:58.8484386Z s3-bucket: gha-artifacts 2025-10-10T00:32:58.8484620Z env: 2025-10-10T00:32:58.8484821Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:58.8485197Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:58.8485755Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:32:58.8486322Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:58.8487178Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:58.8488392Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:58.8488644Z AWS_REGION: us-east-1 2025-10-10T00:32:58.8488950Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:58.8489271Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:58.8494278Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:58.8494467Z ##[endgroup] 2025-10-10T00:32:58.8520324Z ##[group]Run seemethere/download-artifact-s3@v4 2025-10-10T00:32:58.8520625Z with: 2025-10-10T00:32:58.8520839Z name: linux-jammy-rocm-py3.10 2025-10-10T00:32:58.8521117Z s3-bucket: gha-artifacts 2025-10-10T00:32:58.8521361Z region: us-east-1 2025-10-10T00:32:58.8521568Z env: 2025-10-10T00:32:58.8521767Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:32:58.8522153Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:32:58.8522701Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 
2025-10-10T00:32:58.8523209Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:32:58.8524063Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:32:58.8524819Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:32:58.8525071Z AWS_REGION: us-east-1 2025-10-10T00:32:58.8525341Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:32:58.8525672Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:32:58.8530379Z AWS_SESSION_TOKEN: *** 2025-10-10T00:32:58.8530615Z ##[endgroup] 2025-10-10T00:32:59.3027936Z (node:1428601) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-10-10T00:32:59.3028881Z 2025-10-10T00:32:59.3029280Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-10-10T00:32:59.3030281Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-10-10T00:32:59.3031335Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-10-10T00:32:59.4630096Z Found 1 objects with prefix pytorch/pytorch/18392306192/linux-jammy-rocm-py3.10/ 2025-10-10T00:32:59.4631503Z Starting download (1/1): /var/home/pytorchci/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-10-10T00:33:16.2022940Z Finished download (1/1): /var/home/pytorchci/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-10-10T00:33:16.2035392Z Artifact download has finished successfully 2025-10-10T00:33:16.2603047Z ##[group]Run unzip -o artifacts.zip 2025-10-10T00:33:16.2603776Z unzip -o artifacts.zip 2025-10-10T00:33:16.2661222Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:16.2661907Z env: 2025-10-10T00:33:16.2662307Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:16.2663600Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:16.2664760Z RUNNER_TEST_RESULTS_DIR: 
/var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:16.2665807Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:16.2667534Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:16.2669056Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:16.2669563Z AWS_REGION: us-east-1 2025-10-10T00:33:16.2670135Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:16.2670777Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:16.2680544Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:16.2681086Z ##[endgroup] 2025-10-10T00:33:16.2816338Z Archive: artifacts.zip 2025-10-10T00:33:16.2818414Z creating: dist/ 2025-10-10T00:33:19.5272874Z inflating: dist/torch-2.10.0a0+git344e636-cp310-cp310-linux_x86_64.whl 2025-10-10T00:33:19.5404261Z inflating: dist/.ninja_log 2025-10-10T00:33:19.5411066Z creating: build/custom_test_artifacts/ 2025-10-10T00:33:19.5411943Z creating: build/custom_test_artifacts/custom-op-build/ 2025-10-10T00:33:19.5413701Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-10-10T00:33:19.5414737Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-10-10T00:33:19.5415935Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-10-10T00:33:19.5417078Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/ 2025-10-10T00:33:19.5418187Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-10-10T00:33:19.5419363Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-10-10T00:33:19.5420544Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-10-10T00:33:19.5421891Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-10-10T00:33:19.5423272Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-10-10T00:33:19.5424536Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-10-10T00:33:19.5425746Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-10-10T00:33:19.5426991Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-10-10T00:33:19.5428429Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-10-10T00:33:19.5429867Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-10-10T00:33:19.5431200Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-10-10T00:33:19.5432647Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-10-10T00:33:19.5434404Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-10-10T00:33:19.5435732Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-10-10T00:33:19.5436777Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-10-10T00:33:19.5437859Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-10-10T00:33:19.5438981Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-10-10T00:33:19.5440215Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-10-10T00:33:19.5441641Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-10-10T00:33:19.5443449Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-10-10T00:33:19.5444730Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-10-10T00:33:19.5480171Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-10-10T00:33:19.5481833Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-10-10T00:33:19.5483269Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-10-10T00:33:19.5484640Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-10-10T00:33:19.5486018Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-10-10T00:33:19.5487352Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-10-10T00:33:19.5640411Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-10-10T00:33:19.5641757Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-10-10T00:33:19.5643118Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-10-10T00:33:19.5645113Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-10-10T00:33:19.5646533Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-10-10T00:33:19.5647861Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-10-10T00:33:19.5649256Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-10-10T00:33:19.5650643Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-10-10T00:33:19.5652018Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-10-10T00:33:19.5653369Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-10-10T00:33:19.5654737Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-10-10T00:33:19.5665472Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-10-10T00:33:19.5740835Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-10-10T00:33:19.5742436Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-10-10T00:33:19.5743826Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-10-10T00:33:19.5745180Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-10-10T00:33:19.5746323Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-10-10T00:33:19.5747444Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-10-10T00:33:19.5748612Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_outer_vec.cc 2025-10-10T00:33:19.5749707Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_vec_ext.cc 2025-10-10T00:33:19.5750723Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-10-10T00:33:19.5751684Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-10-10T00:33:19.5752638Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-10-10T00:33:19.5907826Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-10-10T00:33:19.5959240Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-10-10T00:33:19.5960796Z creating: build/custom_test_artifacts/jit-hook-build/ 
2025-10-10T00:33:19.5961671Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-10-10T00:33:19.5962670Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-10-10T00:33:19.5964456Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-10-10T00:33:19.5965715Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/ 2025-10-10T00:33:19.5966860Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-10-10T00:33:19.5968081Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-10-10T00:33:19.5969262Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-10-10T00:33:19.5970616Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-10-10T00:33:19.5972036Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-10-10T00:33:19.5973310Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-10-10T00:33:19.5974530Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-10-10T00:33:19.5976432Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-10-10T00:33:19.5977814Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-10-10T00:33:19.5979223Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-10-10T00:33:19.5980501Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-10-10T00:33:19.5981876Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-10-10T00:33:19.5983373Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-10-10T00:33:19.5984667Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-10-10T00:33:19.5985688Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-10-10T00:33:19.5986792Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-10-10T00:33:19.5987908Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-10-10T00:33:19.5989162Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-10-10T00:33:19.5990606Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-10-10T00:33:19.5991990Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-10-10T00:33:19.5993292Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-10-10T00:33:19.5994825Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-10-10T00:33:19.5996164Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-10-10T00:33:19.5997510Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-10-10T00:33:19.5998830Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-10-10T00:33:19.6000163Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-10-10T00:33:19.6004480Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-10-10T00:33:19.6063750Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-10-10T00:33:19.6065920Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-10-10T00:33:19.6067315Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-10-10T00:33:19.6068598Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-10-10T00:33:19.6069734Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-10-10T00:33:19.6070832Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-10-10T00:33:19.6071948Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_outer_vec.cc 2025-10-10T00:33:19.6073052Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_vec_ext.cc 2025-10-10T00:33:19.6074252Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-10-10T00:33:19.6075189Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-10-10T00:33:19.6076148Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-10-10T00:33:19.6106125Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-10-10T00:33:19.6107165Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-10-10T00:33:19.6108651Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-10-10T00:33:19.6109766Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-10-10T00:33:19.6111065Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-10-10T00:33:19.6112275Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/ 2025-10-10T00:33:19.6113472Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-10-10T00:33:19.6114911Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-10-10T00:33:19.6116164Z creating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-10-10T00:33:19.6117603Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-10-10T00:33:19.6119058Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-10-10T00:33:19.6120426Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-10-10T00:33:19.6121732Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-10-10T00:33:19.6123001Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-10-10T00:33:19.6124516Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-10-10T00:33:19.6126038Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-10-10T00:33:19.6127438Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-10-10T00:33:19.6128942Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-10-10T00:33:19.6130540Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-10-10T00:33:19.6131922Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-10-10T00:33:19.6133057Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-10-10T00:33:19.6134207Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-10-10T00:33:19.6135431Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-10-10T00:33:19.6136805Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 
2025-10-10T00:33:19.6138701Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-10-10T00:33:19.6140231Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-10-10T00:33:19.6141636Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-10-10T00:33:19.6143096Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-10-10T00:33:19.6144593Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-10-10T00:33:19.6146060Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-10-10T00:33:19.6147520Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-10-10T00:33:19.6148987Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-10-10T00:33:19.6150565Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-10-10T00:33:19.6247059Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-10-10T00:33:19.6249009Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-10-10T00:33:19.6250494Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-10-10T00:33:19.6252165Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-10-10T00:33:19.6253765Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-10-10T00:33:19.6255271Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-10-10T00:33:19.6256808Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-10-10T00:33:19.6258353Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-10-10T00:33:19.6259950Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-10-10T00:33:19.6261489Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-10-10T00:33:19.6263041Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-10-10T00:33:19.6272249Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-10-10T00:33:19.6324357Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-10-10T00:33:19.6326073Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-10-10T00:33:19.6327544Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-10-10T00:33:19.6328882Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-10-10T00:33:19.6330094Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-10-10T00:33:19.6331267Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-10-10T00:33:19.6332482Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_outer_vec.cc 2025-10-10T00:33:19.6333691Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_vec_ext.cc 2025-10-10T00:33:19.6334821Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-10-10T00:33:19.6336231Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-10-10T00:33:19.6337286Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-10-10T00:33:19.6424775Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-10-10T00:33:19.6460663Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-10-10T00:33:19.6461525Z creating: build/lib/ 2025-10-10T00:33:19.6536687Z inflating: build/lib/libprotobuf-lite.a 2025-10-10T00:33:19.6943582Z inflating: build/lib/libprotobuf.a 2025-10-10T00:33:19.7396393Z inflating: build/lib/libprotoc.a 2025-10-10T00:33:19.7404854Z inflating: build/lib/libpthreadpool.a 2025-10-10T00:33:19.7412398Z inflating: build/lib/libcpuinfo.a 2025-10-10T00:33:19.7419502Z inflating: build/lib/libcpuinfo_internals.a 2025-10-10T00:33:19.7420246Z inflating: build/lib/libclog.a 2025-10-10T00:33:19.7437587Z inflating: build/lib/libpytorch_qnnpack.a 2025-10-10T00:33:19.7438450Z inflating: build/lib/libnnpack_reference_layers.a 2025-10-10T00:33:19.7608928Z inflating: build/lib/libmicrokernels-prod.a 2025-10-10T00:33:19.7625033Z inflating: build/lib/libnnpack.a 2025-10-10T00:33:19.8419580Z inflating: build/lib/libmicrokernels-all.a 2025-10-10T00:33:19.8484226Z inflating: build/lib/libgtest.a 2025-10-10T00:33:19.8499998Z inflating: build/lib/libgmock.a 2025-10-10T00:33:19.8500682Z inflating: build/lib/libgmock_main.a 2025-10-10T00:33:19.8501271Z inflating: build/lib/libgtest_main.a 2025-10-10T00:33:19.8583481Z inflating: build/lib/libXNNPACK.a 2025-10-10T00:33:19.8651896Z inflating: build/lib/libbenchmark.a 2025-10-10T00:33:19.8652608Z inflating: build/lib/libbenchmark_main.a 2025-10-10T00:33:19.8653240Z inflating: build/lib/libjitprofiling.a 2025-10-10T00:33:19.8660569Z inflating: build/lib/libittnotify.a 2025-10-10T00:33:19.8719589Z inflating: 
build/lib/libasmjit.a 2025-10-10T00:33:19.9829542Z inflating: build/lib/libfbgemm.a 2025-10-10T00:33:19.9856908Z inflating: build/lib/libtensorpipe_uv.a 2025-10-10T00:33:20.0372831Z inflating: build/lib/libtensorpipe.a 2025-10-10T00:33:20.0483720Z inflating: build/lib/libgloo.a 2025-10-10T00:33:20.0527173Z inflating: build/lib/libonnx_proto.a 2025-10-10T00:33:20.0937137Z inflating: build/lib/libgloo_hip.a 2025-10-10T00:33:20.1598989Z inflating: build/lib/libonnx.a 2025-10-10T00:33:21.1112304Z inflating: build/lib/libdnnl.a 2025-10-10T00:33:21.1129658Z inflating: build/lib/libfmt.a 2025-10-10T00:33:21.1407429Z inflating: build/lib/libkineto.a 2025-10-10T00:33:21.1511898Z inflating: build/lib/libc10.so 2025-10-10T00:33:21.1512646Z inflating: build/lib/libtorch_global_deps.so 2025-10-10T00:33:21.1563098Z inflating: build/lib/libc10_hip.so 2025-10-10T00:33:21.1563816Z inflating: build/lib/libcaffe2_nvrtc.so 2025-10-10T00:33:21.2029809Z inflating: build/lib/libfbgemm_genai.a 2025-10-10T00:33:23.9540583Z inflating: build/lib/libtorch_cpu.so 2025-10-10T00:33:23.9544216Z inflating: build/lib/libshm.so 2025-10-10T00:33:24.8594716Z inflating: build/lib/libtorch_hip.so 2025-10-10T00:33:24.8595412Z inflating: build/lib/libtorch.so 2025-10-10T00:33:24.8612632Z inflating: build/lib/libjitbackend_test.so 2025-10-10T00:33:24.8677842Z inflating: build/lib/libtorchbind_test.so 2025-10-10T00:33:24.8699525Z inflating: build/lib/libbackend_with_compiler.so 2025-10-10T00:33:24.8724083Z inflating: build/lib/libaoti_custom_ops.so 2025-10-10T00:33:25.0788796Z inflating: build/lib/libtorch_python.so 2025-10-10T00:33:25.0820775Z inflating: build/lib/libnnapi_backend.so 2025-10-10T00:33:25.0821450Z creating: build/bin/ 2025-10-10T00:33:25.0821944Z creating: build/bin/CMakeFiles/ 2025-10-10T00:33:25.0822526Z inflating: build/bin/cmake_install.cmake 2025-10-10T00:33:25.0823165Z inflating: build/bin/CTestTestfile.cmake 2025-10-10T00:33:25.1232938Z inflating: build/bin/protoc-3.13.0.0 
2025-10-10T00:33:25.1642360Z inflating: build/bin/protoc 2025-10-10T00:33:25.1695261Z inflating: build/bin/c10_AllocatorConfig_test 2025-10-10T00:33:25.1745104Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-10-10T00:33:25.1796215Z inflating: build/bin/c10_DeviceGuard_test 2025-10-10T00:33:25.1847815Z inflating: build/bin/c10_Device_test 2025-10-10T00:33:25.1907219Z inflating: build/bin/c10_DispatchKeySet_test 2025-10-10T00:33:25.1956308Z inflating: build/bin/c10_StreamGuard_test 2025-10-10T00:33:25.2012957Z inflating: build/bin/c10_SymInt_test 2025-10-10T00:33:25.2066797Z inflating: build/bin/c10_Scalar_test 2025-10-10T00:33:25.2122449Z inflating: build/bin/c10_SizesAndStrides_test 2025-10-10T00:33:25.2177635Z inflating: build/bin/c10_InlineStreamGuard_test 2025-10-10T00:33:25.2231574Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-10-10T00:33:25.2281167Z inflating: build/bin/c10_ArrayRef_test 2025-10-10T00:33:25.2333711Z inflating: build/bin/c10_Bitset_test 2025-10-10T00:33:25.2402739Z inflating: build/bin/c10_cow_test 2025-10-10T00:33:25.2451464Z inflating: build/bin/c10_ConstexprCrc_test 2025-10-10T00:33:25.2501991Z inflating: build/bin/c10_Half_test 2025-10-10T00:33:25.2551394Z inflating: build/bin/c10_DeadlockDetection_test 2025-10-10T00:33:25.2608149Z inflating: build/bin/c10_Enumerate_test 2025-10-10T00:33:25.2663353Z inflating: build/bin/c10_LeftRight_test 2025-10-10T00:33:25.2716170Z inflating: build/bin/c10_IntrusiveList_test 2025-10-10T00:33:25.2771147Z inflating: build/bin/c10_Metaprogramming_test 2025-10-10T00:33:25.2820822Z inflating: build/bin/c10_Semaphore_test 2025-10-10T00:33:25.2875823Z inflating: build/bin/c10_ThreadLocal_test 2025-10-10T00:33:25.2928812Z inflating: build/bin/c10_NetworkFlow_test 2025-10-10T00:33:25.2978467Z inflating: build/bin/c10_Synchronized_test 2025-10-10T00:33:25.3030114Z inflating: build/bin/c10_TypeIndex_test 2025-10-10T00:33:25.3080887Z inflating: build/bin/c10_TypeList_test 
2025-10-10T00:33:25.3130255Z inflating: build/bin/c10_TypeTraits_test 2025-10-10T00:33:25.3181574Z inflating: build/bin/c10_accumulate_test 2025-10-10T00:33:25.3237945Z inflating: build/bin/c10_complex_math_test 2025-10-10T00:33:25.3293133Z inflating: build/bin/c10_bfloat16_test 2025-10-10T00:33:25.3343459Z inflating: build/bin/c10_bit_cast_test 2025-10-10T00:33:25.3395381Z inflating: build/bin/c10_exception_test 2025-10-10T00:33:25.3450302Z inflating: build/bin/c10_complex_test 2025-10-10T00:33:25.3499646Z inflating: build/bin/c10_error_test 2025-10-10T00:33:25.3549798Z inflating: build/bin/c10_flags_test 2025-10-10T00:33:25.3599979Z inflating: build/bin/c10_generic_math_test 2025-10-10T00:33:25.3761078Z inflating: build/bin/c10_intrusive_ptr_test 2025-10-10T00:33:25.3811964Z inflating: build/bin/c10_irange_test 2025-10-10T00:33:25.3864872Z inflating: build/bin/c10_lazy_test 2025-10-10T00:33:25.3921712Z inflating: build/bin/c10_logging_test 2025-10-10T00:33:25.3982603Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-10-10T00:33:25.4056263Z inflating: build/bin/c10_optional_test 2025-10-10T00:33:25.4109009Z inflating: build/bin/c10_registry_test 2025-10-10T00:33:25.4259325Z inflating: build/bin/c10_small_vector_test 2025-10-10T00:33:25.4311230Z inflating: build/bin/c10_ssize_test 2025-10-10T00:33:25.4367537Z inflating: build/bin/c10_string_util_test 2025-10-10T00:33:25.4417377Z inflating: build/bin/c10_tempfile_test 2025-10-10T00:33:25.4466526Z inflating: build/bin/c10_string_view_test 2025-10-10T00:33:25.4509783Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-10-10T00:33:25.4565711Z inflating: build/bin/c10_typeid_test 2025-10-10T00:33:25.4614619Z inflating: build/bin/c10_hip_HIPAssertionsTest_1_var_test 2025-10-10T00:33:25.4663677Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_stream 2025-10-10T00:33:25.4713066Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2025-10-10T00:33:25.4761548Z inflating: 
build/bin/c10_hip_HIPAssertionsTest_from_2_processes 2025-10-10T00:33:25.4810447Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2025-10-10T00:33:25.4859390Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2025-10-10T00:33:25.4908169Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2025-10-10T00:33:25.4957289Z inflating: build/bin/c10_hip_HIPTest 2025-10-10T00:33:25.5516768Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-10-10T00:33:25.6091245Z inflating: build/bin/vec_test_all_types_AVX512 2025-10-10T00:33:25.6670998Z inflating: build/bin/vec_test_all_types_AVX2 2025-10-10T00:33:25.6723195Z inflating: build/bin/BackoffTest 2025-10-10T00:33:25.6776448Z inflating: build/bin/FileStoreTest 2025-10-10T00:33:25.6832673Z inflating: build/bin/TCPStoreTest 2025-10-10T00:33:25.6886125Z inflating: build/bin/HashStoreTest 2025-10-10T00:33:25.6950985Z inflating: build/bin/ProcessGroupGlooTest 2025-10-10T00:33:25.6952873Z inflating: build/bin/example_allreduce 2025-10-10T00:33:25.6957193Z inflating: build/bin/torch_shm_manager 2025-10-10T00:33:25.7010549Z inflating: build/bin/static_runtime_bench 2025-10-10T00:33:25.7252343Z inflating: build/bin/static_runtime_test 2025-10-10T00:33:25.7324975Z inflating: build/bin/Dict_test 2025-10-10T00:33:25.7377193Z inflating: build/bin/Dimname_test 2025-10-10T00:33:25.7441628Z inflating: build/bin/MaybeOwned_test 2025-10-10T00:33:25.7498903Z inflating: build/bin/NamedTensor_test 2025-10-10T00:33:25.7557105Z inflating: build/bin/apply_utils_test 2025-10-10T00:33:25.7615782Z inflating: build/bin/atest 2025-10-10T00:33:25.7678902Z inflating: build/bin/basic 2025-10-10T00:33:25.7734028Z inflating: build/bin/broadcast_test 2025-10-10T00:33:25.7784491Z inflating: build/bin/cpu_allocator_test 2025-10-10T00:33:25.7842443Z inflating: build/bin/cpu_generator_test 2025-10-10T00:33:25.7895233Z inflating: 
build/bin/cpu_profiling_allocator_test 2025-10-10T00:33:25.7985179Z inflating: build/bin/cpu_rng_test 2025-10-10T00:33:25.8035907Z inflating: build/bin/dlconvertor_test 2025-10-10T00:33:25.8092694Z inflating: build/bin/extension_backend_test 2025-10-10T00:33:25.8147576Z inflating: build/bin/half_test 2025-10-10T00:33:25.8240961Z inflating: build/bin/ivalue_test 2025-10-10T00:33:25.8290439Z inflating: build/bin/lazy_tensor_test 2025-10-10T00:33:25.8344315Z inflating: build/bin/math_kernel_test 2025-10-10T00:33:25.8397621Z inflating: build/bin/memory_overlapping_test 2025-10-10T00:33:25.8451311Z inflating: build/bin/memory_format_test 2025-10-10T00:33:25.8504199Z inflating: build/bin/mobile_memory_cleanup 2025-10-10T00:33:25.8560067Z inflating: build/bin/native_test 2025-10-10T00:33:25.8610712Z inflating: build/bin/operator_name_test 2025-10-10T00:33:25.8662632Z inflating: build/bin/packedtensoraccessor_test 2025-10-10T00:33:25.8713363Z inflating: build/bin/operators_test 2025-10-10T00:33:25.8770425Z inflating: build/bin/quantized_test 2025-10-10T00:33:25.8836674Z inflating: build/bin/pow_test 2025-10-10T00:33:25.8886663Z inflating: build/bin/reduce_ops_test 2025-10-10T00:33:25.8937324Z inflating: build/bin/reportMemoryUsage_test 2025-10-10T00:33:25.8988583Z inflating: build/bin/StorageUtils_test 2025-10-10T00:33:25.9044691Z inflating: build/bin/scalar_tensor_test 2025-10-10T00:33:25.9102838Z inflating: build/bin/scalar_test 2025-10-10T00:33:25.9154474Z inflating: build/bin/stride_properties_test 2025-10-10T00:33:25.9232542Z inflating: build/bin/tensor_iterator_test 2025-10-10T00:33:25.9286798Z inflating: build/bin/test_parallel 2025-10-10T00:33:25.9341704Z inflating: build/bin/type_ptr_test 2025-10-10T00:33:25.9392285Z inflating: build/bin/thread_init_test 2025-10-10T00:33:25.9450682Z inflating: build/bin/type_test 2025-10-10T00:33:25.9502850Z inflating: build/bin/undefined_tensor_test 2025-10-10T00:33:25.9552453Z inflating: build/bin/verify_api_visibility 
2025-10-10T00:33:25.9620928Z inflating: build/bin/legacy_vmap_test 2025-10-10T00:33:25.9672128Z inflating: build/bin/weakref_test 2025-10-10T00:33:25.9723431Z inflating: build/bin/wrapdim_test 2025-10-10T00:33:25.9782284Z inflating: build/bin/IListRef_test 2025-10-10T00:33:25.9833490Z inflating: build/bin/xla_tensor_test 2025-10-10T00:33:25.9938181Z inflating: build/bin/List_test 2025-10-10T00:33:26.0054193Z inflating: build/bin/kernel_function_legacy_test 2025-10-10T00:33:26.0146730Z inflating: build/bin/kernel_function_test 2025-10-10T00:33:26.0211881Z inflating: build/bin/KernelFunction_test 2025-10-10T00:33:26.0334239Z inflating: build/bin/kernel_lambda_legacy_test 2025-10-10T00:33:26.0433624Z inflating: build/bin/kernel_lambda_test 2025-10-10T00:33:26.0526119Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-10-10T00:33:26.0585645Z inflating: build/bin/kernel_stackbased_test 2025-10-10T00:33:26.0636809Z inflating: build/bin/CppSignature_test 2025-10-10T00:33:26.0685484Z inflating: build/bin/op_allowlist_test 2025-10-10T00:33:26.0984320Z inflating: build/bin/op_registration_test 2025-10-10T00:33:26.1049975Z inflating: build/bin/inline_container_test 2025-10-10T00:33:26.1099178Z inflating: build/bin/hip_complex_math_test 2025-10-10T00:33:26.1153949Z inflating: build/bin/backend_fallback_test 2025-10-10T00:33:26.1206814Z inflating: build/bin/hip_apply_test 2025-10-10T00:33:26.1255490Z inflating: build/bin/hip_complex_test 2025-10-10T00:33:26.1304605Z inflating: build/bin/hip_distributions_test 2025-10-10T00:33:26.1353136Z inflating: build/bin/hip_generator_test 2025-10-10T00:33:26.1402745Z inflating: build/bin/hip_half_test 2025-10-10T00:33:26.1451679Z inflating: build/bin/hip_integer_divider_test 2025-10-10T00:33:26.1500590Z inflating: build/bin/hip_optional_test 2025-10-10T00:33:26.1549514Z inflating: build/bin/hip_packedtensoraccessor_test 2025-10-10T00:33:26.1601023Z inflating: build/bin/hip_dlconvertor_test 2025-10-10T00:33:26.1649694Z 
inflating: build/bin/hip_vectorized_test 2025-10-10T00:33:26.2674048Z inflating: build/bin/test_jit 2025-10-10T00:33:26.2728562Z inflating: build/bin/test_dist_autograd 2025-10-10T00:33:26.2795657Z inflating: build/bin/test_cpp_rpc 2025-10-10T00:33:26.3887149Z inflating: build/bin/test_api 2025-10-10T00:33:26.3888425Z inflating: build/bin/parallel_benchmark 2025-10-10T00:33:26.4219935Z inflating: build/bin/test_lazy 2025-10-10T00:33:26.4220601Z creating: .additional_ci_files/ 2025-10-10T00:33:26.4284371Z inflating: .additional_ci_files/test-times.json 2025-10-10T00:33:26.4524769Z inflating: .additional_ci_files/test-class-times.json 2025-10-10T00:33:26.4588882Z ##[group]Run rm artifacts.zip 2025-10-10T00:33:26.4589446Z rm artifacts.zip 2025-10-10T00:33:26.4646439Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:26.4647242Z env: 2025-10-10T00:33:26.4647713Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:26.4648648Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:26.4649996Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:26.4651043Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:26.4652768Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:26.4654327Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:26.4654843Z AWS_REGION: us-east-1 2025-10-10T00:33:26.4655443Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:26.4656757Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:26.4666802Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:26.4667261Z ##[endgroup] 2025-10-10T00:33:26.7209240Z ##[group]Run df -H 2025-10-10T00:33:26.7209666Z df -H 2025-10-10T00:33:26.7261708Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:26.7262405Z env: 
2025-10-10T00:33:26.7262801Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:26.7263546Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:26.7264648Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:26.7265710Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:26.7267444Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:26.7268978Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:26.7269496Z AWS_REGION: us-east-1 2025-10-10T00:33:26.7270052Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:26.7270719Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:26.7281276Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:26.7281828Z ##[endgroup] 2025-10-10T00:33:26.7455778Z Filesystem Size Used Avail Use% Mounted on 2025-10-10T00:33:26.7456680Z tmpfs 109G 33M 109G 1% /run 2025-10-10T00:33:26.7457473Z /dev/nvme0n1p2 944G 73G 824G 9% / 2025-10-10T00:33:26.7458247Z tmpfs 542G 33k 542G 1% /dev/shm 2025-10-10T00:33:26.7458937Z tmpfs 5.3M 0 5.3M 0% /run/lock 2025-10-10T00:33:26.7459669Z /dev/nvme0n1p1 536M 6.4M 530M 2% /boot/efi 2025-10-10T00:33:26.7460465Z /dev/nvme1n1p1 3.8T 1.9T 1.8T 51% /media/4TB 2025-10-10T00:33:26.7461266Z tmpfs 109G 33k 109G 1% /run/user/1307800118 2025-10-10T00:33:26.7462080Z 172.18.148.8:/export/amd2 5.5T 278G 5.3T 6% /mnt 2025-10-10T00:33:26.7463076Z pure1.jax.cs.cpe.ice.amd.com:/homes/amd-pytorch 108G 412M 107G 1% /home/amd-pytorch 2025-10-10T00:33:26.7464193Z pure2.jax.cs.cpe.ice.amd.com:/GroupStorage 165T 143T 23T 87% /groups 2025-10-10T00:33:26.7465284Z pure2.jax.cs.cpe.ice.amd.com:/GroupStorage/Scratch 5.5T 1.8T 3.8T 33% /scratch 2025-10-10T00:33:26.7466479Z pure1.jax.cs.cpe.ice.amd.com:/homes/nlingamp 108G 2.2G 106G 3% /home/nlingamp 2025-10-10T00:33:26.7467564Z 
pure1.jax.cs.cpe.ice.amd.com:/homes/manitera 108G 1.1M 108G 1% /home/manitera 2025-10-10T00:33:26.7468846Z pure1.jax.cs.cpe.ice.amd.com:/homes/okakarpa 108G 4.5G 103G 5% /home/okakarpa 2025-10-10T00:33:26.7532863Z Prepare all required actions 2025-10-10T00:33:26.7533713Z Getting action download info 2025-10-10T00:33:26.9715626Z ##[group]Run ./.github/actions/download-td-artifacts 2025-10-10T00:33:26.9716181Z with: 2025-10-10T00:33:26.9716509Z env: 2025-10-10T00:33:26.9716849Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:26.9717521Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:26.9718547Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:26.9719453Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:26.9720991Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:26.9722342Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:26.9722794Z AWS_REGION: us-east-1 2025-10-10T00:33:26.9723346Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:26.9723944Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:26.9732461Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:26.9732868Z ##[endgroup] 2025-10-10T00:33:26.9807174Z ##[group]Run seemethere/download-artifact-s3@v4 2025-10-10T00:33:26.9807691Z with: 2025-10-10T00:33:26.9808030Z name: td_results 2025-10-10T00:33:26.9808426Z s3-bucket: gha-artifacts 2025-10-10T00:33:26.9808841Z region: us-east-1 2025-10-10T00:33:26.9809181Z env: 2025-10-10T00:33:26.9809512Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:26.9810176Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:26.9811155Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:26.9812064Z RUNNER_DOCS_DIR: 
/var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:26.9813603Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:26.9814975Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:26.9815419Z AWS_REGION: us-east-1 2025-10-10T00:33:26.9815901Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:26.9816478Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:26.9824982Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:26.9825781Z ##[endgroup] 2025-10-10T00:33:27.4317448Z (node:1428669) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-10-10T00:33:27.4318372Z 2025-10-10T00:33:27.4318727Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-10-10T00:33:27.4319734Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-10-10T00:33:27.4320723Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-10-10T00:33:27.5676390Z Found 1 objects with prefix pytorch/pytorch/18392306192/td_results/ 2025-10-10T00:33:27.5677664Z Starting download (1/1): /var/home/pytorchci/actions-runner/_work/pytorch/pytorch/td_results.json 2025-10-10T00:33:27.7605714Z Finished download (1/1): /var/home/pytorchci/actions-runner/_work/pytorch/pytorch/td_results.json 2025-10-10T00:33:27.7616983Z Artifact download has finished successfully 2025-10-10T00:33:27.8190827Z ##[group]Run mkdir -p .additional_ci_files 2025-10-10T00:33:27.8191489Z mkdir -p .additional_ci_files 2025-10-10T00:33:27.8192283Z mv td_results.json .additional_ci_files/td_results.json || true 2025-10-10T00:33:27.8246522Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:27.8246870Z env: 2025-10-10T00:33:27.8247082Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:27.8247484Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 
2025-10-10T00:33:27.8248067Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:27.8248594Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:27.8249752Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:27.8250571Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:27.8250841Z AWS_REGION: us-east-1 2025-10-10T00:33:27.8251190Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:27.8251663Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:27.8261939Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:27.8262423Z ##[endgroup] 2025-10-10T00:33:27.8478855Z ##[group]Run .github/scripts/parse_ref.py 2025-10-10T00:33:27.8479564Z .github/scripts/parse_ref.py 2025-10-10T00:33:27.8527532Z shell: /usr/bin/bash -e {0} 2025-10-10T00:33:27.8528028Z env: 2025-10-10T00:33:27.8528430Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:27.8529195Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:27.8530353Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:27.8531401Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:27.8533116Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:27.8534662Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:27.8535169Z AWS_REGION: us-east-1 2025-10-10T00:33:27.8535727Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:27.8536430Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:27.8546102Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:27.8546557Z ##[endgroup] 2025-10-10T00:33:27.8990086Z Setting output branch=main 2025-10-10T00:33:27.9226815Z Prepare all required 
actions 2025-10-10T00:33:27.9227566Z Getting action download info 2025-10-10T00:33:28.0957819Z ##[group]Run ./.github/actions/filter-test-configs 2025-10-10T00:33:28.0958356Z with: 2025-10-10T00:33:28.0959016Z github-token: *** 2025-10-10T00:33:28.0961316Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.2"}]} 2025-10-10T00:33:28.0964321Z job-name: linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T00:33:28.0964999Z env: 2025-10-10T00:33:28.0965404Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:28.0966077Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:28.0967048Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:28.0967939Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:28.0969452Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:28.0970829Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:28.0971296Z AWS_REGION: us-east-1 2025-10-10T00:33:28.0971812Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:28.0972397Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:28.0980923Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:28.0981326Z ##[endgroup] 2025-10-10T00:33:28.1039753Z ##[group]Run nick-fields/retry@v3.0.0 2025-10-10T00:33:28.1040235Z with: 2025-10-10T00:33:28.1040582Z shell: bash 
2025-10-10T00:33:28.1040949Z timeout_minutes: 10 2025-10-10T00:33:28.1041333Z max_attempts: 5 2025-10-10T00:33:28.1041709Z retry_wait_seconds: 30 2025-10-10T00:33:28.1042937Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-10-10T00:33:28.1044192Z polling_interval_seconds: 1 2025-10-10T00:33:28.1044712Z warning_on_retry: true 2025-10-10T00:33:28.1045204Z continue_on_error: false 2025-10-10T00:33:28.1045675Z env: 2025-10-10T00:33:28.1046077Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:28.1046871Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:28.1048014Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:28.1049079Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:28.1050861Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:28.1052251Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:28.1052709Z AWS_REGION: us-east-1 2025-10-10T00:33:28.1053195Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:28.1053810Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:28.1062328Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:28.1062913Z GITHUB_TOKEN: *** 2025-10-10T00:33:28.1063295Z ##[endgroup] 2025-10-10T00:33:28.2095605Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-10-10T00:33:28.4864618Z Defaulting to user installation because normal site-packages is not writeable 2025-10-10T00:33:28.6591444Z Requirement already satisfied: requests==2.27.1 in /var/home/pytorchci/.local/lib/python3.10/site-packages (2.27.1) 2025-10-10T00:33:28.6595761Z Requirement already satisfied: pyyaml==6.0.2 in /var/home/pytorchci/.local/lib/python3.10/site-packages 
(6.0.2) 2025-10-10T00:33:28.6680299Z Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests==2.27.1) (2020.6.20) 2025-10-10T00:33:28.6683751Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3/dist-packages (from requests==2.27.1) (1.26.5) 2025-10-10T00:33:28.6691578Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests==2.27.1) (3.3) 2025-10-10T00:33:28.6701200Z Requirement already satisfied: charset-normalizer~=2.0.0 in /var/home/pytorchci/.local/lib/python3.10/site-packages (from requests==2.27.1) (2.0.12) 2025-10-10T00:33:29.2098441Z Command completed after 1 attempt(s). 2025-10-10T00:33:29.2233759Z ##[group]Run set -x 2025-10-10T00:33:29.2234405Z set -x 2025-10-10T00:33:29.2234826Z  2025-10-10T00:33:29.2235545Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-10-10T00:33:29.2236422Z # in runner workspace 2025-10-10T00:33:29.2237174Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-10-10T00:33:29.2296961Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:29.2297756Z env: 2025-10-10T00:33:29.2298197Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:29.2299022Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:29.2300254Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:29.2301401Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:29.2303264Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:29.2304966Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:29.2305521Z AWS_REGION: us-east-1 2025-10-10T00:33:29.2306114Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:29.2306828Z AWS_SECRET_ACCESS_KEY: *** 
2025-10-10T00:33:29.2317223Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:29.2317726Z ##[endgroup] 2025-10-10T00:33:29.2419116Z + python3 /var/home/pytorchci/actions-runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-10-10T00:33:29.2765397Z Setting output branch=main 2025-10-10T00:33:29.2901685Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-10-10T00:33:29.2902475Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-10-10T00:33:29.2903113Z echo "Job name: ${JOB_NAME}" 2025-10-10T00:33:29.2903650Z  2025-10-10T00:33:29.2904350Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-10-10T00:33:29.2905259Z # in runner workspace 2025-10-10T00:33:29.2906039Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-10-10T00:33:29.2906912Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-10-10T00:33:29.2907538Z  --job-name "${JOB_NAME}" \ 2025-10-10T00:33:29.2910427Z  --test-matrix "{"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.2"}]}" \ 2025-10-10T00:33:29.2913332Z  --selected-test-configs "" \ 2025-10-10T00:33:29.2913957Z  --pr-number "${PR_NUMBER}" \ 2025-10-10T00:33:29.2914690Z  --tag "${TAG}" \ 2025-10-10T00:33:29.2915252Z  --event-name "${EVENT_NAME}" \ 2025-10-10T00:33:29.2915868Z  --schedule "${SCHEDULE}" \ 2025-10-10T00:33:29.2916423Z  --branch "${HEAD_BRANCH}" 2025-10-10T00:33:29.2976018Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:29.2976758Z env: 2025-10-10T00:33:29.2977175Z GIT_DEFAULT_BRANCH: main 
2025-10-10T00:33:29.2977966Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:29.2979126Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:29.2980206Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:29.2982000Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:29.2984028Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:29.2984566Z AWS_REGION: us-east-1 2025-10-10T00:33:29.2985148Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:29.2985841Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:29.2996566Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:29.2997329Z GITHUB_TOKEN: *** 2025-10-10T00:33:29.2998078Z JOB_NAME: linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T00:33:29.2998901Z PR_NUMBER: 2025-10-10T00:33:29.2999310Z TAG: 2025-10-10T00:33:29.2999697Z EVENT_NAME: push 2025-10-10T00:33:29.3000122Z SCHEDULE: 2025-10-10T00:33:29.3000524Z HEAD_BRANCH: main 2025-10-10T00:33:29.3000948Z ##[endgroup] 2025-10-10T00:33:29.3102215Z Workflow: rocm 2025-10-10T00:33:29.3103051Z Job name: linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T00:33:29.6717913Z Setting output keep-going=True 2025-10-10T00:33:29.6718698Z Setting output ci-verbose-test-logs=False 2025-10-10T00:33:29.6719443Z Setting output ci-test-showlocals=False 2025-10-10T00:33:29.6720104Z Setting output ci-no-test-timeout=False 2025-10-10T00:33:29.6720744Z Setting output ci-no-td=False 2025-10-10T00:33:29.6721341Z Setting output ci-td-distributed=False 2025-10-10T00:33:29.6721956Z Setting output is-unstable=False 2025-10-10T00:33:29.6722554Z Setting output reenabled-issues= 2025-10-10T00:33:29.6725615Z Setting output test-matrix={"include": [{"config": "default", "shard": 1, 
"num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.2"}]} 2025-10-10T00:33:29.6728724Z Setting output is-test-matrix-empty=False 2025-10-10T00:33:29.6973488Z ##[group]Run echo "Filtered matrix:" 2025-10-10T00:33:29.6974146Z echo "Filtered matrix:" 2025-10-10T00:33:29.6976999Z echo "{"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.2"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.2"}]}" 2025-10-10T00:33:29.6979848Z  2025-10-10T00:33:29.6980240Z echo 2025-10-10T00:33:29.6980752Z echo "Is the current job unstable? False" 2025-10-10T00:33:29.6981363Z  2025-10-10T00:33:29.6981739Z echo 2025-10-10T00:33:29.6982202Z echo "Is keep-going label set? True" 2025-10-10T00:33:29.6982813Z  2025-10-10T00:33:29.6983187Z echo 2025-10-10T00:33:29.6983619Z echo "Reenabled issues? 
" 2025-10-10T00:33:29.7042475Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:29.7043189Z env: 2025-10-10T00:33:29.7043609Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:29.7044890Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:29.7046092Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:29.7047195Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:29.7049009Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:29.7050637Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:29.7051171Z AWS_REGION: us-east-1 2025-10-10T00:33:29.7051780Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:29.7052860Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:29.7063147Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:29.7063635Z ##[endgroup] 2025-10-10T00:33:29.7156159Z Filtered matrix: 2025-10-10T00:33:29.7159026Z {include: [{config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.2}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.2}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.2}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.2}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.2}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.2}]} 2025-10-10T00:33:29.7161652Z 2025-10-10T00:33:29.7161934Z Is the current job unstable? False 2025-10-10T00:33:29.7162325Z 2025-10-10T00:33:29.7162554Z Is keep-going label set? True 2025-10-10T00:33:29.7162912Z 2025-10-10T00:33:29.7163104Z Reenabled issues? 
2025-10-10T00:33:29.7238117Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-10-10T00:33:29.7239153Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-10-10T00:33:29.7295376Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:29.7296096Z env: 2025-10-10T00:33:29.7296523Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:29.7297366Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:29.7298578Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:29.7299730Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:29.7301598Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:29.7303248Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:29.7303855Z AWS_REGION: us-east-1 2025-10-10T00:33:29.7304469Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:29.7305186Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:29.7315571Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:29.7316080Z JOB_TIMEOUT: 300 2025-10-10T00:33:29.7316524Z ##[endgroup] 2025-10-10T00:33:29.7464333Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:33:29.7465364Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:33:29.7466220Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:33:29.7522455Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:33:29.7523144Z env: 2025-10-10T00:33:29.7523540Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:29.7524304Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:29.7525432Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:29.7526462Z RUNNER_DOCS_DIR: 
/var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:29.7528291Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:29.7529836Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:29.7530348Z AWS_REGION: us-east-1 2025-10-10T00:33:29.7530915Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:29.7531567Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:29.7541240Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:29.7541711Z ##[endgroup] 2025-10-10T00:33:29.7811604Z ##[group]Run set -x 2025-10-10T00:33:29.7812309Z set -x 2025-10-10T00:33:29.7812800Z  2025-10-10T00:33:29.7813373Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-10-10T00:33:29.7814172Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-10-10T00:33:29.7814855Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-10-10T00:33:29.7815489Z  TEST_COMMAND=.ci/caffe2/test.sh 2025-10-10T00:33:29.7816039Z else 2025-10-10T00:33:29.7816967Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-10-10T00:33:29.7817508Z fi 2025-10-10T00:33:29.7817869Z  2025-10-10T00:33:29.7818447Z # detached container should get cleaned up by teardown_ec2_linux 2025-10-10T00:33:29.7819400Z # TODO: Stop building test binaries as part of the build phase 2025-10-10T00:33:29.7820363Z # Used for GPU_FLAG since that doesn't play nice 2025-10-10T00:33:29.7821199Z # shellcheck disable=SC2086,SC2090 2025-10-10T00:33:29.7821890Z container_name=$(docker run \ 2025-10-10T00:33:29.7822537Z  ${GPU_FLAG:-} \ 2025-10-10T00:33:29.7823062Z  -e BUILD_ENVIRONMENT \ 2025-10-10T00:33:29.7823579Z  -e PR_NUMBER \ 2025-10-10T00:33:29.7824073Z  -e GITHUB_ACTIONS \ 2025-10-10T00:33:29.7824583Z  -e GITHUB_REPOSITORY \ 2025-10-10T00:33:29.7825101Z  -e GITHUB_WORKFLOW \ 2025-10-10T00:33:29.7825591Z  -e GITHUB_JOB \ 2025-10-10T00:33:29.7826075Z  -e GITHUB_RUN_ID \ 2025-10-10T00:33:29.7826553Z  -e GITHUB_RUN_NUMBER \ 
2025-10-10T00:33:29.7827053Z  -e GITHUB_RUN_ATTEMPT \ 2025-10-10T00:33:29.7827554Z  -e JOB_ID \ 2025-10-10T00:33:29.7828002Z  -e JOB_NAME \ 2025-10-10T00:33:29.7828455Z  -e BASE_SHA \ 2025-10-10T00:33:29.7828889Z  -e BRANCH \ 2025-10-10T00:33:29.7829316Z  -e SHA1 \ 2025-10-10T00:33:29.7829769Z  -e AWS_DEFAULT_REGION \ 2025-10-10T00:33:29.7830274Z  -e IN_WHEEL_TEST \ 2025-10-10T00:33:29.7830745Z  -e SHARD_NUMBER \ 2025-10-10T00:33:29.7831220Z  -e TEST_CONFIG \ 2025-10-10T00:33:29.7831690Z  -e NUM_TEST_SHARDS \ 2025-10-10T00:33:29.7832188Z  -e REENABLED_ISSUES \ 2025-10-10T00:33:29.7832714Z  -e CONTINUE_THROUGH_ERROR \ 2025-10-10T00:33:29.7833250Z  -e VERBOSE_TEST_LOGS \ 2025-10-10T00:33:29.7833752Z  -e TEST_SHOWLOCALS \ 2025-10-10T00:33:29.7834587Z  -e NO_TEST_TIMEOUT \ 2025-10-10T00:33:29.7835060Z  -e NO_TD \ 2025-10-10T00:33:29.7835548Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-10-10T00:33:29.7836328Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-10-10T00:33:29.7837071Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-10-10T00:33:29.7837762Z  -e TESTS_TO_INCLUDE \ 2025-10-10T00:33:29.7838310Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-10-10T00:33:29.7838854Z  -e DASHBOARD_TAG \ 2025-10-10T00:33:29.7839491Z  --env-file="${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" \ 2025-10-10T00:33:29.7840215Z  --ulimit stack=10485760:83886080 \ 2025-10-10T00:33:29.7840768Z  --ulimit core=0 \ 2025-10-10T00:33:29.7841350Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-10-10T00:33:29.7842054Z  --security-opt seccomp=unconfined \ 2025-10-10T00:33:29.7842655Z  --cap-add=SYS_PTRACE \ 2025-10-10T00:33:29.7843204Z  --shm-size="8g" \ 2025-10-10T00:33:29.7843654Z  --tty \ 2025-10-10T00:33:29.7844069Z  --detach \ 2025-10-10T00:33:29.7844536Z  --name="${container_name}" \ 2025-10-10T00:33:29.7845083Z  --user jenkins \ 2025-10-10T00:33:29.7845687Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-10-10T00:33:29.7846416Z  -w /var/lib/jenkins/workspace \ 2025-10-10T00:33:29.7847449Z  
"${DOCKER_IMAGE}" 2025-10-10T00:33:29.7847991Z ) 2025-10-10T00:33:29.7848535Z # save container name for later step 2025-10-10T00:33:29.7849375Z echo "CONTAINER_NAME=${container_name}" >> "$GITHUB_ENV" 2025-10-10T00:33:29.7850875Z # jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home 2025-10-10T00:33:29.7852836Z docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}" 2025-10-10T00:33:29.7905246Z shell: /usr/bin/bash -e {0} 2025-10-10T00:33:29.7905745Z env: 2025-10-10T00:33:29.7906135Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:33:29.7906905Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T00:33:29.7908040Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T00:33:29.7909081Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T00:33:29.7910808Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T00:33:29.7912369Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:33:29.7912863Z AWS_REGION: us-east-1 2025-10-10T00:33:29.7913428Z AWS_ACCESS_KEY_ID: *** 2025-10-10T00:33:29.7914298Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T00:33:29.7924123Z AWS_SESSION_TOKEN: *** 2025-10-10T00:33:29.7924662Z BUILD_ENVIRONMENT: linux-jammy-rocm-py3.10 2025-10-10T00:33:29.7925253Z PR_NUMBER: 2025-10-10T00:33:29.7925707Z GITHUB_REPOSITORY: pytorch/pytorch 2025-10-10T00:33:29.7926253Z GITHUB_WORKFLOW: rocm 2025-10-10T00:33:29.7926705Z GITHUB_JOB: test 2025-10-10T00:33:29.7927130Z GITHUB_RUN_ID: 18392306192 2025-10-10T00:33:29.7927611Z GITHUB_RUN_NUMBER: 31941 2025-10-10T00:33:29.7928069Z GITHUB_RUN_ATTEMPT: 1 2025-10-10T00:33:29.7928505Z JOB_ID: 52406492265 2025-10-10T00:33:29.7929167Z 
JOB_NAME: linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T00:33:29.7929946Z BRANCH: main 2025-10-10T00:33:29.7930402Z SHA1: 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:29.7931052Z BASE_SHA: 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:29.7931631Z TEST_CONFIG: default 2025-10-10T00:33:29.7932054Z SHARD_NUMBER: 1 2025-10-10T00:33:29.7932451Z NUM_TEST_SHARDS: 6 2025-10-10T00:33:29.7932875Z REENABLED_ISSUES: 2025-10-10T00:33:29.7933298Z CONTINUE_THROUGH_ERROR: True 2025-10-10T00:33:29.7933781Z VERBOSE_TEST_LOGS: False 2025-10-10T00:33:29.7934239Z TEST_SHOWLOCALS: False 2025-10-10T00:33:29.7934687Z NO_TEST_TIMEOUT: False 2025-10-10T00:33:29.7935111Z NO_TD: False 2025-10-10T00:33:29.7936342Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:33:29.7937697Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2025-10-10T00:33:29.7938257Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-10-10T00:33:29.7938775Z TESTS_TO_INCLUDE: 2025-10-10T00:33:29.7939181Z DASHBOARD_TAG: 2025-10-10T00:33:29.7939795Z HUGGING_FACE_HUB_TOKEN: *** 2025-10-10T00:33:29.7940271Z ##[endgroup] 2025-10-10T00:33:29.8024885Z + [[ default == \m\u\l\t\i\g\p\u ]] 2025-10-10T00:33:29.8025592Z + [[ linux-jammy-rocm-py3.10 == *onnx* ]] 2025-10-10T00:33:29.8026249Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-10-10T00:33:29.8045972Z +++ nproc --ignore=2 2025-10-10T00:33:29.8083983Z ++ docker run --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e 
NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e MAX_JOBS=126 -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e TESTS_TO_INCLUDE -e HUGGING_FACE_HUB_TOKEN -e DASHBOARD_TAG --env-file=/var/home/pytorchci/actions-runner/_work/_temp/github_env_18392306192 --ulimit stack=10485760:83886080 --ulimit core=0 --env-file=/tmp/github_env_18392306192 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --shm-size=8g --tty --detach --name= --user jenkins -v /var/home/pytorchci/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:33:30.0212936Z + container_name=496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T00:33:30.0214348Z + echo CONTAINER_NAME=496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T00:33:30.0216343Z + docker exec -t 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 sh -c 'cd .. 
&& cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && .ci/pytorch/test.sh' 2025-10-10T00:33:46.1265401Z Processing ./dist/torch-2.10.0a0+git344e636-cp310-cp310-linux_x86_64.whl 2025-10-10T00:33:46.6855000Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636) (3.18.0) 2025-10-10T00:33:46.6858340Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636) (4.12.2) 2025-10-10T00:33:46.6860270Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636) (1.13.3) 2025-10-10T00:33:46.6865705Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636) (2.8.8) 2025-10-10T00:33:46.6867794Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636) (3.1.6) 2025-10-10T00:33:46.6872599Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636) (2025.9.0) 2025-10-10T00:33:46.7199622Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.10.0a0+git344e636) (1.3.0) 2025-10-10T00:33:46.7247198Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.10.0a0+git344e636) (3.0.3) 2025-10-10T00:33:47.0662851Z Installing collected packages: torch 2025-10-10T00:33:57.0002610Z ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. 2025-10-10T00:33:57.0004428Z helion 0.1.6 requires filecheck, which is not installed. 
2025-10-10T00:33:57.0005357Z Successfully installed torch-2.10.0a0+git344e636 2025-10-10T00:33:57.0507118Z + export TERM=vt100 2025-10-10T00:33:57.0507685Z + TERM=vt100 2025-10-10T00:33:57.0515370Z ++ dirname .ci/pytorch/test.sh 2025-10-10T00:33:57.0546798Z + source .ci/pytorch/common.sh 2025-10-10T00:33:57.0558551Z +++ dirname .ci/pytorch/common.sh 2025-10-10T00:33:57.0589798Z ++ source .ci/pytorch/common_utils.sh 2025-10-10T00:33:57.0590527Z +++ declare -f -t trap_add 2025-10-10T00:33:57.0603106Z ++ set -ex -o pipefail 2025-10-10T00:33:57.0603887Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-10-10T00:33:57.0604591Z ++ unset HIP_PLATFORM 2025-10-10T00:33:57.0605168Z ++ export PYTORCH_TEST_WITH_ROCM=1 2025-10-10T00:33:57.0605808Z ++ PYTORCH_TEST_WITH_ROCM=1 2025-10-10T00:33:57.0606391Z ++ BUILD_TEST_LIBTORCH=0 2025-10-10T00:33:57.0613600Z ++ dirname .ci/pytorch/test.sh 2025-10-10T00:33:57.0645541Z + source .ci/pytorch/common-build.sh 2025-10-10T00:33:57.0648362Z ++ [[ linux-jammy-rocm-py3.10 != *win-* ]] 2025-10-10T00:33:57.0668789Z ++++ dirname .ci/pytorch/common-build.sh 2025-10-10T00:33:57.0698765Z +++ cd .ci/pytorch 2025-10-10T00:33:57.0699557Z +++ pwd -P 2025-10-10T00:33:57.0706995Z ++ script_dir=/var/lib/jenkins/pytorch/.ci/pytorch 2025-10-10T00:33:57.0707871Z ++ [[ linux-jammy-rocm-py3.10 == *-pch* ]] 2025-10-10T00:33:57.0708547Z ++ which sccache 2025-10-10T00:33:57.0742712Z ++ [[ -z '' ]] 2025-10-10T00:33:57.0743684Z ++ unset SCCACHE_BUCKET 2025-10-10T00:33:57.0744149Z ++ unset SCCACHE_REGION 2025-10-10T00:33:57.0744610Z ++ sccache --stop-server 2025-10-10T00:33:57.0812644Z ++ true 2025-10-10T00:33:57.0812925Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-10-10T00:33:57.0847802Z ++ trap_add sccache_epilogue EXIT 2025-10-10T00:33:57.0848528Z ++ trap_add_cmd=sccache_epilogue 2025-10-10T00:33:57.0849028Z ++ shift 2025-10-10T00:33:57.0849429Z ++ for trap_add_name in "$@" 2025-10-10T00:33:57.0867684Z ++++ trap -p EXIT 2025-10-10T00:33:57.0875402Z +++ eval 
'extract_trap_cmd ' 2025-10-10T00:33:57.0876097Z ++++ extract_trap_cmd 2025-10-10T00:33:57.0876550Z ++++ printf '%s\n' '' 2025-10-10T00:33:57.0877024Z +++ printf '%s\n' sccache_epilogue 2025-10-10T00:33:57.0884792Z ++ trap -- ' 2025-10-10T00:33:57.0885282Z sccache_epilogue' EXIT 2025-10-10T00:33:57.0885721Z ++ [[ -n '' ]] 2025-10-10T00:33:57.0886137Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-10-10T00:33:57.0886768Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-10-10T00:33:57.0887374Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-10-10T00:33:57.0887806Z ++ sccache --start-server 2025-10-10T00:33:57.0939190Z sccache: Starting the server... 2025-10-10T00:33:57.1656020Z sccache: Listening on address 127.0.0.1:4226 2025-10-10T00:33:57.1684071Z ++ sccache --zero-stats 2025-10-10T00:33:57.1744563Z Statistics zeroed. 2025-10-10T00:33:57.1754352Z ++ which ccache 2025-10-10T00:33:57.1785537Z + [[ linux-jammy-rocm-py3.10 != *rocm* ]] 2025-10-10T00:33:57.1786300Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-10-10T00:33:57.1786898Z + echo 'Environment variables:' 2025-10-10T00:33:57.1787437Z Environment variables: 2025-10-10T00:33:57.1787866Z + env 2025-10-10T00:33:57.1815291Z GITHUB_WORKSPACE=/var/home/pytorchci/actions-runner/_work/pytorch/pytorch 2025-10-10T00:33:57.1816217Z CONTINUE_THROUGH_ERROR=True 2025-10-10T00:33:57.1816802Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-10-10T00:33:57.1817475Z HOSTNAME=gpud501.jax.cs.cpe.ice.amd.com 2025-10-10T00:33:57.1818707Z GITHUB_PATH=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/add_path_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.1819856Z GITHUB_ACTION=__run_2 2025-10-10T00:33:57.1820342Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-10-10T00:33:57.1820891Z GITHUB_RUN_NUMBER=31941 2025-10-10T00:33:57.1821353Z TEST_CONFIG=default 2025-10-10T00:33:57.1821828Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-10-10T00:33:57.1822414Z AWS_DEFAULT_REGION=us-east-1 2025-10-10T00:33:57.1822965Z 
GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-10-10T00:33:57.1823519Z GITHUB_REF_TYPE=branch 2025-10-10T00:33:57.1824057Z BASE_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:57.1824996Z HUGGING_FACE_HUB_TOKEN=*** 2025-10-10T00:33:57.1828738Z *** 2025-10-10T00:33:57.1829170Z GITHUB_REPOSITORY_ID=65600975 2025-10-10T00:33:57.1829699Z GITHUB_ACTIONS=true 2025-10-10T00:33:57.1830194Z SHA1=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:57.1830837Z GITHUB_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:57.1831730Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/rocm.yml@refs/heads/main 2025-10-10T00:33:57.1832518Z UCC_HOME=/usr 2025-10-10T00:33:57.1833009Z VERBOSE_TEST_LOGS=False 2025-10-10T00:33:57.1833473Z GITHUB_REF=refs/heads/main 2025-10-10T00:33:57.1833934Z SHARD_NUMBER=1 2025-10-10T00:33:57.1834499Z GITHUB_REF_PROTECTED=true 2025-10-10T00:33:57.1834968Z HOME=/var/lib/jenkins 2025-10-10T00:33:57.1835463Z GITHUB_API_URL=https://api.github.com 2025-10-10T00:33:57.1836573Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-10-10T00:33:57.1837116Z LANG=C.UTF-8 2025-10-10T00:33:57.1837608Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-10-10T00:33:57.1838240Z PYTORCH_TEST_WITH_ROCM=1 2025-10-10T00:33:57.1838705Z NUM_TEST_SHARDS=6 2025-10-10T00:33:57.1839114Z UCX_HOME=/usr 2025-10-10T00:33:57.1840182Z GITHUB_STATE=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/save_state_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.1841577Z JOB_NAME=linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T00:33:57.1842705Z MAGMA_HOME=/opt/rocm/magma 2025-10-10T00:33:57.1843756Z GITHUB_ENV=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_env_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.1845229Z GITHUB_EVENT_PATH=/var/home/pytorchci/actions-runner/_work/_temp/_github_workflow/event.json 2025-10-10T00:33:57.1846128Z GITHUB_EVENT_NAME=push 
2025-10-10T00:33:57.1846562Z DASHBOARD_TAG= 2025-10-10T00:33:57.1846977Z GITHUB_RUN_ID=18392306192 2025-10-10T00:33:57.1848110Z GITHUB_STEP_SUMMARY=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/step_summary_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.1849329Z GITHUB_ACTOR=pytorchmergebot 2025-10-10T00:33:57.1849809Z PR_NUMBER= 2025-10-10T00:33:57.1850193Z GITHUB_RUN_ATTEMPT=1 2025-10-10T00:33:57.1850636Z ANACONDA_PYTHON_VERSION=3.10 2025-10-10T00:33:57.1870008Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-10-10T00:33:57.1870708Z TERM=vt100 2025-10-10T00:33:57.1871131Z INSTALLED_VISION=yes 2025-10-10T00:33:57.1871585Z BRANCH=main 2025-10-10T00:33:57.1872017Z OPENSSL_ROOT_DIR=/opt/openssl 2025-10-10T00:33:57.1872519Z TESTS_TO_INCLUDE= 2025-10-10T00:33:57.1873467Z GITHUB_ACTION_PATH=/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-10-10T00:33:57.1874680Z GITHUB_SERVER_URL=https://github.com 2025-10-10T00:33:57.1875318Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950 2025-10-10T00:33:57.1876024Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-10-10T00:33:57.1876643Z REENABLED_ISSUES= 2025-10-10T00:33:57.1877038Z SHLVL=1 2025-10-10T00:33:57.1877400Z MAX_JOBS=126 2025-10-10T00:33:57.1877807Z GITHUB_ACTOR_ID=97764156 2025-10-10T00:33:57.1878417Z GITHUB_WORKFLOW_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:57.1879071Z GITHUB_REF_NAME=main 2025-10-10T00:33:57.1879502Z ROCM_PATH=/opt/rocm 2025-10-10T00:33:57.1879909Z GITHUB_JOB=test 2025-10-10T00:33:57.1880323Z NO_TEST_TIMEOUT=False 2025-10-10T00:33:57.1880803Z GITHUB_REPOSITORY=pytorch/pytorch 2025-10-10T00:33:57.1881312Z LC_ALL=C.UTF-8 2025-10-10T00:33:57.1881718Z GITHUB_RETENTION_DAYS=90 2025-10-10T00:33:57.1882191Z OPENSSL_DIR=/opt/openssl 2025-10-10T00:33:57.1882656Z GITHUB_ACTION_REPOSITORY= 2025-10-10T00:33:57.1884339Z 
PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-10-10T00:33:57.1886040Z GITHUB_BASE_REF= 2025-10-10T00:33:57.1886437Z CI=true 2025-10-10T00:33:57.1886845Z GITHUB_REPOSITORY_OWNER=pytorch 2025-10-10T00:33:57.1887332Z JOB_ID=52406492265 2025-10-10T00:33:57.1887742Z GITHUB_HEAD_REF= 2025-10-10T00:33:57.1888138Z GITHUB_ACTION_REF= 2025-10-10T00:33:57.1888545Z TEST_SHOWLOCALS=False 2025-10-10T00:33:57.1888972Z GITHUB_WORKFLOW=rocm 2025-10-10T00:33:57.1889422Z DEBIAN_FRONTEND=noninteractive 2025-10-10T00:33:57.1890561Z GITHUB_OUTPUT=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_output_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.1891728Z NO_TD=False 2025-10-10T00:33:57.1892122Z OLDPWD=/var/lib/jenkins 2025-10-10T00:33:57.1892555Z _=/usr/bin/env 2025-10-10T00:33:57.1893128Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-10-10T00:33:57.2115392Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-10-10T00:33:57.2117024Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T00:33:57.2118031Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-10-10T00:33:57.2119019Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-10-10T00:33:57.2119768Z + BUILD_DIR=build 2025-10-10T00:33:57.2120226Z + BUILD_RENAMED_DIR=build_renamed 2025-10-10T00:33:57.2120758Z + BUILD_BIN_DIR=build/bin 2025-10-10T00:33:57.2121213Z + SHARD_NUMBER=1 2025-10-10T00:33:57.2121615Z + NUM_TEST_SHARDS=6 2025-10-10T00:33:57.2122392Z + export TORCH_SERIALIZATION_DEBUG=1 2025-10-10T00:33:57.2122949Z + TORCH_SERIALIZATION_DEBUG=1 2025-10-10T00:33:57.2123437Z + export VALGRIND=ON 2025-10-10T00:33:57.2123853Z + VALGRIND=ON 2025-10-10T00:33:57.2124303Z + [[ 
linux-jammy-rocm-py3.10 == *clang9* ]] 2025-10-10T00:33:57.2124897Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-10-10T00:33:57.2125436Z + detect_cuda_arch 2025-10-10T00:33:57.2125880Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-10-10T00:33:57.2126455Z + [[ linux-jammy-rocm-py3.10 == *s390x* ]] 2025-10-10T00:33:57.2126975Z + [[ 0 == \1 ]] 2025-10-10T00:33:57.2127367Z + [[ True == \1 ]] 2025-10-10T00:33:57.2127809Z + [[ linux-jammy-rocm-py3.10 != *bazel* ]] 2025-10-10T00:33:57.2128400Z ++ realpath build/custom_test_artifacts 2025-10-10T00:33:57.2163435Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/pytorch/build/custom_test_artifacts 2025-10-10T00:33:57.2164348Z + [[ -n '' ]] 2025-10-10T00:33:57.2164809Z + echo 'Environment variables' 2025-10-10T00:33:57.2165343Z Environment variables 2025-10-10T00:33:57.2165766Z + env 2025-10-10T00:33:57.2202743Z GITHUB_WORKSPACE=/var/home/pytorchci/actions-runner/_work/pytorch/pytorch 2025-10-10T00:33:57.2203683Z CONTINUE_THROUGH_ERROR=True 2025-10-10T00:33:57.2204274Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-10-10T00:33:57.2204922Z HOSTNAME=gpud501.jax.cs.cpe.ice.amd.com 2025-10-10T00:33:57.2206102Z GITHUB_PATH=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/add_path_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.2207251Z GITHUB_ACTION=__run_2 2025-10-10T00:33:57.2207730Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-10-10T00:33:57.2208251Z GITHUB_RUN_NUMBER=31941 2025-10-10T00:33:57.2208697Z TEST_CONFIG=default 2025-10-10T00:33:57.2209144Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-10-10T00:33:57.2209690Z AWS_DEFAULT_REGION=us-east-1 2025-10-10T00:33:57.2210262Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-10-10T00:33:57.2210819Z GITHUB_REF_TYPE=branch 2025-10-10T00:33:57.2211329Z BASE_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:57.2212163Z HUGGING_FACE_HUB_TOKEN=*** 2025-10-10T00:33:57.2212849Z *** 2025-10-10T00:33:57.2213245Z GITHUB_REPOSITORY_ID=65600975 
2025-10-10T00:33:57.2213754Z GITHUB_ACTIONS=true 2025-10-10T00:33:57.2214241Z SHA1=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:57.2214923Z GITHUB_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:57.2215793Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/rocm.yml@refs/heads/main 2025-10-10T00:33:57.2216609Z UCC_HOME=/usr 2025-10-10T00:33:57.2217032Z TORCH_SERIALIZATION_DEBUG=1 2025-10-10T00:33:57.2217528Z VERBOSE_TEST_LOGS=False 2025-10-10T00:33:57.2217991Z GITHUB_REF=refs/heads/main 2025-10-10T00:33:57.2218447Z SHARD_NUMBER=1 2025-10-10T00:33:57.2218857Z GITHUB_REF_PROTECTED=true 2025-10-10T00:33:57.2219309Z HOME=/var/lib/jenkins 2025-10-10T00:33:57.2219800Z GITHUB_API_URL=https://api.github.com 2025-10-10T00:33:57.2220387Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-10-10T00:33:57.2220888Z LANG=C.UTF-8 2025-10-10T00:33:57.2221376Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-10-10T00:33:57.2221988Z PYTORCH_TEST_WITH_ROCM=1 2025-10-10T00:33:57.2222436Z NUM_TEST_SHARDS=6 2025-10-10T00:33:57.2222833Z UCX_HOME=/usr 2025-10-10T00:33:57.2223876Z GITHUB_STATE=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/save_state_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.2225268Z JOB_NAME=linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T00:33:57.2226524Z MAGMA_HOME=/opt/rocm/magma 2025-10-10T00:33:57.2227606Z GITHUB_ENV=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_env_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.2229076Z GITHUB_EVENT_PATH=/var/home/pytorchci/actions-runner/_work/_temp/_github_workflow/event.json 2025-10-10T00:33:57.2229970Z GITHUB_EVENT_NAME=push 2025-10-10T00:33:57.2230422Z DASHBOARD_TAG= 2025-10-10T00:33:57.2230833Z GITHUB_RUN_ID=18392306192 2025-10-10T00:33:57.2231991Z GITHUB_STEP_SUMMARY=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/step_summary_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 
2025-10-10T00:33:57.2233607Z GITHUB_ACTOR=pytorchmergebot 2025-10-10T00:33:57.2234263Z PR_NUMBER= 2025-10-10T00:33:57.2234662Z GITHUB_RUN_ATTEMPT=1 2025-10-10T00:33:57.2235097Z VALGRIND=ON 2025-10-10T00:33:57.2235518Z ANACONDA_PYTHON_VERSION=3.10 2025-10-10T00:33:57.2236104Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-10-10T00:33:57.2236711Z TERM=vt100 2025-10-10T00:33:57.2237098Z INSTALLED_VISION=yes 2025-10-10T00:33:57.2237566Z BRANCH=main 2025-10-10T00:33:57.2237978Z OPENSSL_ROOT_DIR=/opt/openssl 2025-10-10T00:33:57.2238454Z TESTS_TO_INCLUDE= 2025-10-10T00:33:57.2239367Z GITHUB_ACTION_PATH=/var/home/pytorchci/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-10-10T00:33:57.2240399Z GITHUB_SERVER_URL=https://github.com 2025-10-10T00:33:57.2240976Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950 2025-10-10T00:33:57.2241602Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-10-10T00:33:57.2242211Z REENABLED_ISSUES= 2025-10-10T00:33:57.2242606Z SHLVL=1 2025-10-10T00:33:57.2242953Z MAX_JOBS=126 2025-10-10T00:33:57.2243358Z GITHUB_ACTOR_ID=97764156 2025-10-10T00:33:57.2243956Z GITHUB_WORKFLOW_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:33:57.2244705Z GITHUB_REF_NAME=main 2025-10-10T00:33:57.2245130Z ROCM_PATH=/opt/rocm 2025-10-10T00:33:57.2245550Z GITHUB_JOB=test 2025-10-10T00:33:57.2245956Z NO_TEST_TIMEOUT=False 2025-10-10T00:33:57.2246435Z GITHUB_REPOSITORY=pytorch/pytorch 2025-10-10T00:33:57.2246940Z LC_ALL=C.UTF-8 2025-10-10T00:33:57.2247351Z GITHUB_RETENTION_DAYS=90 2025-10-10T00:33:57.2247814Z OPENSSL_DIR=/opt/openssl 2025-10-10T00:33:57.2248283Z GITHUB_ACTION_REPOSITORY= 2025-10-10T00:33:57.2249947Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-10-10T00:33:57.2251647Z GITHUB_BASE_REF= 2025-10-10T00:33:57.2252050Z CI=true 
2025-10-10T00:33:57.2252451Z GITHUB_REPOSITORY_OWNER=pytorch 2025-10-10T00:33:57.2252930Z JOB_ID=52406492265 2025-10-10T00:33:57.2253328Z GITHUB_HEAD_REF= 2025-10-10T00:33:57.2253727Z GITHUB_ACTION_REF= 2025-10-10T00:33:57.2254137Z TEST_SHOWLOCALS=False 2025-10-10T00:33:57.2254577Z GITHUB_WORKFLOW=rocm 2025-10-10T00:33:57.2255026Z DEBIAN_FRONTEND=noninteractive 2025-10-10T00:33:57.2256226Z GITHUB_OUTPUT=/var/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_output_0251309a-b1cf-4aac-98d9-a22f5fbd0ce0 2025-10-10T00:33:57.2257568Z NO_TD=False 2025-10-10T00:33:57.2257953Z OLDPWD=/var/lib/jenkins 2025-10-10T00:33:57.2258374Z _=/usr/bin/env 2025-10-10T00:33:57.2258774Z + echo 'Testing pytorch' 2025-10-10T00:33:57.2259215Z Testing pytorch 2025-10-10T00:33:57.2259633Z + export LANG=C.UTF-8 2025-10-10T00:33:57.2260049Z + LANG=C.UTF-8 2025-10-10T00:33:57.2260413Z + PR_NUMBER= 2025-10-10T00:33:57.2260818Z + [[ default == \d\e\f\a\u\l\t ]] 2025-10-10T00:33:57.2261340Z + export CUDA_VISIBLE_DEVICES=0 2025-10-10T00:33:57.2261834Z + CUDA_VISIBLE_DEVICES=0 2025-10-10T00:33:57.2262302Z + export HIP_VISIBLE_DEVICES=0 2025-10-10T00:33:57.2262787Z + HIP_VISIBLE_DEVICES=0 2025-10-10T00:33:57.2263273Z + [[ default == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-10-10T00:33:57.2263822Z + [[ default == \s\l\o\w ]] 2025-10-10T00:33:57.2264371Z + [[ linux-jammy-rocm-py3.10 == *slow-gradcheck* ]] 2025-10-10T00:33:57.2265021Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-10-10T00:33:57.2265924Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-10-10T00:33:57.2266541Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-10-10T00:33:57.2267140Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-10-10T00:33:57.2267688Z + [[ default == *crossref* ]] 2025-10-10T00:33:57.2268198Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-10-10T00:33:57.2268746Z + export VALGRIND=OFF 2025-10-10T00:33:57.2269184Z + VALGRIND=OFF 2025-10-10T00:33:57.2269579Z + rocminfo 2025-10-10T00:33:57.2379414Z ROCk module version 
6.10.5 is loaded 2025-10-10T00:33:57.4383463Z ===================== 2025-10-10T00:33:57.4384160Z HSA System Attributes 2025-10-10T00:33:57.4384774Z ===================== 2025-10-10T00:33:57.4385310Z Runtime Version: 1.18 2025-10-10T00:33:57.4385884Z Runtime Ext Version: 1.11 2025-10-10T00:33:57.4386439Z System Timestamp Freq.: 1000.000000MHz 2025-10-10T00:33:57.4387285Z Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-10-10T00:33:57.4388252Z Machine Model: LARGE 2025-10-10T00:33:57.4388984Z System Endianness: LITTLE 2025-10-10T00:33:57.4389641Z Mwaitx: DISABLED 2025-10-10T00:33:57.4390245Z XNACK enabled: NO 2025-10-10T00:33:57.4390752Z DMAbuf Support: YES 2025-10-10T00:33:57.4391255Z VMM Support: YES 2025-10-10T00:33:57.4391555Z 2025-10-10T00:33:57.4391725Z ========== 2025-10-10T00:33:57.4392222Z HSA Agents 2025-10-10T00:33:57.4392683Z ========== 2025-10-10T00:33:57.4393086Z ******* 2025-10-10T00:33:57.4393505Z Agent 1 2025-10-10T00:33:57.4393937Z ******* 2025-10-10T00:33:57.4394726Z Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:33:57.4395434Z Uuid: CPU-XX 2025-10-10T00:33:57.4396155Z Marketing Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:33:57.4396888Z Vendor Name: CPU 2025-10-10T00:33:57.4397562Z Feature: None specified 2025-10-10T00:33:57.4398246Z Profile: FULL_PROFILE 2025-10-10T00:33:57.4398927Z Float Round Mode: NEAR 2025-10-10T00:33:57.4399623Z Max Queue Number: 0(0x0) 2025-10-10T00:33:57.4400309Z Queue Min Size: 0(0x0) 2025-10-10T00:33:57.4400996Z Queue Max Size: 0(0x0) 2025-10-10T00:33:57.4401667Z Queue Type: MULTI 2025-10-10T00:33:57.4402297Z Node: 0 2025-10-10T00:33:57.4402931Z Device Type: CPU 2025-10-10T00:33:57.4403529Z Cache Info: 2025-10-10T00:33:57.4404044Z L1: 32768(0x8000) KB 2025-10-10T00:33:57.4404664Z Chip ID: 0(0x0) 2025-10-10T00:33:57.4405322Z ASIC Revision: 0(0x0) 2025-10-10T00:33:57.4406008Z Cacheline Size: 64(0x40) 2025-10-10T00:33:57.4406690Z Max Clock Freq. 
(MHz): 2000 2025-10-10T00:33:57.4407340Z BDFID: 0 2025-10-10T00:33:57.4408007Z Internal Node ID: 0 2025-10-10T00:33:57.4408686Z Compute Unit: 64 2025-10-10T00:33:57.4409353Z SIMDs per CU: 0 2025-10-10T00:33:57.4410018Z Shader Engines: 0 2025-10-10T00:33:57.4410706Z Shader Arrs. per Eng.: 0 2025-10-10T00:33:57.4412054Z WatchPts on Addr. Ranges:1 2025-10-10T00:33:57.4412727Z Memory Properties: 2025-10-10T00:33:57.4413191Z Features: None 2025-10-10T00:33:57.4413664Z Pool Info: 2025-10-10T00:33:57.4414120Z Pool 1 2025-10-10T00:33:57.4414695Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4415384Z Size: 528249788(0x1f7c73bc) KB 2025-10-10T00:33:57.4416065Z Allocatable: TRUE 2025-10-10T00:33:57.4417175Z Alloc Granule: 4KB 2025-10-10T00:33:57.4417912Z Alloc Recommended Granule:4KB 2025-10-10T00:33:57.4418696Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4419411Z Accessible by all: TRUE 2025-10-10T00:33:57.4420027Z Pool 2 2025-10-10T00:33:57.4420609Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4421285Z Size: 528249788(0x1f7c73bc) KB 2025-10-10T00:33:57.4421950Z Allocatable: TRUE 2025-10-10T00:33:57.4422635Z Alloc Granule: 4KB 2025-10-10T00:33:57.4423351Z Alloc Recommended Granule:4KB 2025-10-10T00:33:57.4424079Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4424796Z Accessible by all: TRUE 2025-10-10T00:33:57.4425399Z Pool 3 2025-10-10T00:33:57.4425957Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-10-10T00:33:57.4426610Z Size: 528249788(0x1f7c73bc) KB 2025-10-10T00:33:57.4427262Z Allocatable: TRUE 2025-10-10T00:33:57.4427955Z Alloc Granule: 4KB 2025-10-10T00:33:57.4428675Z Alloc Recommended Granule:4KB 2025-10-10T00:33:57.4429397Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4430103Z Accessible by all: TRUE 2025-10-10T00:33:57.4430710Z Pool 4 2025-10-10T00:33:57.4431258Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4432203Z Size: 528249788(0x1f7c73bc) KB 2025-10-10T00:33:57.4432851Z Allocatable: TRUE 2025-10-10T00:33:57.4433540Z Alloc 
Granule: 4KB 2025-10-10T00:33:57.4434378Z Alloc Recommended Granule:4KB 2025-10-10T00:33:57.4435103Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4435816Z Accessible by all: TRUE 2025-10-10T00:33:57.4436435Z ISA Info: 2025-10-10T00:33:57.4436877Z ******* 2025-10-10T00:33:57.4437308Z Agent 2 2025-10-10T00:33:57.4437730Z ******* 2025-10-10T00:33:57.4438232Z Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:33:57.4438876Z Uuid: CPU-XX 2025-10-10T00:33:57.4439553Z Marketing Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:33:57.4440266Z Vendor Name: CPU 2025-10-10T00:33:57.4440930Z Feature: None specified 2025-10-10T00:33:57.4441596Z Profile: FULL_PROFILE 2025-10-10T00:33:57.4442274Z Float Round Mode: NEAR 2025-10-10T00:33:57.4442962Z Max Queue Number: 0(0x0) 2025-10-10T00:33:57.4443965Z Queue Min Size: 0(0x0) 2025-10-10T00:33:57.4444649Z Queue Max Size: 0(0x0) 2025-10-10T00:33:57.4445310Z Queue Type: MULTI 2025-10-10T00:33:57.4445929Z Node: 1 2025-10-10T00:33:57.4446556Z Device Type: CPU 2025-10-10T00:33:57.4447153Z Cache Info: 2025-10-10T00:33:57.4447950Z L1: 32768(0x8000) KB 2025-10-10T00:33:57.4448558Z Chip ID: 0(0x0) 2025-10-10T00:33:57.4449198Z ASIC Revision: 0(0x0) 2025-10-10T00:33:57.4449877Z Cacheline Size: 64(0x40) 2025-10-10T00:33:57.4450555Z Max Clock Freq. (MHz): 2000 2025-10-10T00:33:57.4451205Z BDFID: 0 2025-10-10T00:33:57.4451841Z Internal Node ID: 1 2025-10-10T00:33:57.4452510Z Compute Unit: 64 2025-10-10T00:33:57.4453156Z SIMDs per CU: 0 2025-10-10T00:33:57.4453813Z Shader Engines: 0 2025-10-10T00:33:57.4454512Z Shader Arrs. per Eng.: 0 2025-10-10T00:33:57.4455238Z WatchPts on Addr. 
Ranges:1 2025-10-10T00:33:57.4455866Z Memory Properties: 2025-10-10T00:33:57.4456343Z Features: None 2025-10-10T00:33:57.4456812Z Pool Info: 2025-10-10T00:33:57.4457261Z Pool 1 2025-10-10T00:33:57.4457830Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4458506Z Size: 528402452(0x1f7ec814) KB 2025-10-10T00:33:57.4459302Z Allocatable: TRUE 2025-10-10T00:33:57.4460114Z Alloc Granule: 4KB 2025-10-10T00:33:57.4460976Z Alloc Recommended Granule:4KB 2025-10-10T00:33:57.4461772Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4462483Z Accessible by all: TRUE 2025-10-10T00:33:57.4463127Z Pool 2 2025-10-10T00:33:57.4463704Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4464393Z Size: 528402452(0x1f7ec814) KB 2025-10-10T00:33:57.4465070Z Allocatable: TRUE 2025-10-10T00:33:57.4465766Z Alloc Granule: 4KB 2025-10-10T00:33:57.4466491Z Alloc Recommended Granule:4KB 2025-10-10T00:33:57.4467236Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4467950Z Accessible by all: TRUE 2025-10-10T00:33:57.4468564Z Pool 3 2025-10-10T00:33:57.4469133Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-10-10T00:33:57.4469784Z Size: 528402452(0x1f7ec814) KB 2025-10-10T00:33:57.4470438Z Allocatable: TRUE 2025-10-10T00:33:57.4471142Z Alloc Granule: 4KB 2025-10-10T00:33:57.4471866Z Alloc Recommended Granule:4KB 2025-10-10T00:33:57.4472597Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4473312Z Accessible by all: TRUE 2025-10-10T00:33:57.4473923Z Pool 4 2025-10-10T00:33:57.4475122Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4475796Z Size: 528402452(0x1f7ec814) KB 2025-10-10T00:33:57.4476450Z Allocatable: TRUE 2025-10-10T00:33:57.4477140Z Alloc Granule: 4KB 2025-10-10T00:33:57.4477859Z Alloc Recommended Granule:4KB 2025-10-10T00:33:57.4478594Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4479828Z Accessible by all: TRUE 2025-10-10T00:33:57.4480589Z ISA Info: 2025-10-10T00:33:57.4481121Z ******* 2025-10-10T00:33:57.4481656Z Agent 3 2025-10-10T00:33:57.4482157Z ******* 
2025-10-10T00:33:57.4482756Z Name: gfx90a 2025-10-10T00:33:57.4483531Z Uuid: GPU-963d686164f2ce12 2025-10-10T00:33:57.4484216Z Marketing Name: 2025-10-10T00:33:57.4484906Z Vendor Name: AMD 2025-10-10T00:33:57.4485575Z Feature: KERNEL_DISPATCH 2025-10-10T00:33:57.4486252Z Profile: BASE_PROFILE 2025-10-10T00:33:57.4486944Z Float Round Mode: NEAR 2025-10-10T00:33:57.4487634Z Max Queue Number: 128(0x80) 2025-10-10T00:33:57.4488321Z Queue Min Size: 64(0x40) 2025-10-10T00:33:57.4489066Z Queue Max Size: 131072(0x20000) 2025-10-10T00:33:57.4489858Z Queue Type: MULTI 2025-10-10T00:33:57.4490605Z Node: 2 2025-10-10T00:33:57.4491356Z Device Type: GPU 2025-10-10T00:33:57.4491977Z Cache Info: 2025-10-10T00:33:57.4492478Z L1: 16(0x10) KB 2025-10-10T00:33:57.4493067Z L2: 8192(0x2000) KB 2025-10-10T00:33:57.4493675Z Chip ID: 29708(0x740c) 2025-10-10T00:33:57.4494339Z ASIC Revision: 1(0x1) 2025-10-10T00:33:57.4495033Z Cacheline Size: 128(0x80) 2025-10-10T00:33:57.4495739Z Max Clock Freq. (MHz): 1700 2025-10-10T00:33:57.4496392Z BDFID: 12800 2025-10-10T00:33:57.4497062Z Internal Node ID: 2 2025-10-10T00:33:57.4497757Z Compute Unit: 104 2025-10-10T00:33:57.4498417Z SIMDs per CU: 4 2025-10-10T00:33:57.4499094Z Shader Engines: 8 2025-10-10T00:33:57.4499793Z Shader Arrs. per Eng.: 1 2025-10-10T00:33:57.4500507Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:33:57.4501228Z Coherent Host Access: FALSE 2025-10-10T00:33:57.4501872Z Memory Properties: 2025-10-10T00:33:57.4502390Z Features: KERNEL_DISPATCH 2025-10-10T00:33:57.4503034Z Fast F16 Operation: TRUE 2025-10-10T00:33:57.4503768Z Wavefront Size: 64(0x40) 2025-10-10T00:33:57.4504473Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4505123Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4505665Z x 1024(0x400) 2025-10-10T00:33:57.4506243Z y 1024(0x400) 2025-10-10T00:33:57.4507071Z z 1024(0x400) 2025-10-10T00:33:57.4507714Z Max Waves Per CU: 32(0x20) 2025-10-10T00:33:57.4508415Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:33:57.4509106Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4509724Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4510236Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4511143Z y 65535(0xffff) 2025-10-10T00:33:57.4511719Z z 65535(0xffff) 2025-10-10T00:33:57.4512380Z Max fbarriers/Workgrp: 32 2025-10-10T00:33:57.4513189Z Packet Processor uCode:: 92 2025-10-10T00:33:57.4513931Z SDMA engine uCode:: 9 2025-10-10T00:33:57.4514759Z IOMMU Support:: None 2025-10-10T00:33:57.4515376Z Pool Info: 2025-10-10T00:33:57.4515835Z Pool 1 2025-10-10T00:33:57.4516403Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4517078Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4517751Z Allocatable: TRUE 2025-10-10T00:33:57.4518443Z Alloc Granule: 4KB 2025-10-10T00:33:57.4519281Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4520154Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4520995Z Accessible by all: FALSE 2025-10-10T00:33:57.4521717Z Pool 2 2025-10-10T00:33:57.4522384Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4523190Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4523916Z Allocatable: TRUE 2025-10-10T00:33:57.4524619Z Alloc Granule: 4KB 2025-10-10T00:33:57.4525337Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4526061Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4526776Z 
Accessible by all: FALSE 2025-10-10T00:33:57.4527404Z Pool 3 2025-10-10T00:33:57.4527953Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4528610Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4529381Z Allocatable: TRUE 2025-10-10T00:33:57.4530204Z Alloc Granule: 4KB 2025-10-10T00:33:57.4531074Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4531874Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4532593Z Accessible by all: FALSE 2025-10-10T00:33:57.4533218Z Pool 4 2025-10-10T00:33:57.4533759Z Segment: GROUP 2025-10-10T00:33:57.4534407Z Size: 64(0x40) KB 2025-10-10T00:33:57.4535079Z Allocatable: FALSE 2025-10-10T00:33:57.4535786Z Alloc Granule: 0KB 2025-10-10T00:33:57.4536514Z Alloc Recommended Granule:0KB 2025-10-10T00:33:57.4537250Z Alloc Alignment: 0KB 2025-10-10T00:33:57.4537967Z Accessible by all: FALSE 2025-10-10T00:33:57.4538588Z ISA Info: 2025-10-10T00:33:57.4539368Z ISA 1 2025-10-10T00:33:57.4539984Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:33:57.4540749Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:33:57.4541486Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:33:57.4542213Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4542959Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4543969Z Fast f16: TRUE 2025-10-10T00:33:57.4544666Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4545335Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4545924Z x 1024(0x400) 2025-10-10T00:33:57.4546529Z y 1024(0x400) 2025-10-10T00:33:57.4547109Z z 1024(0x400) 2025-10-10T00:33:57.4547755Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4548392Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4548920Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4549512Z y 65535(0xffff) 2025-10-10T00:33:57.4550104Z z 65535(0xffff) 2025-10-10T00:33:57.4550763Z FBarrier Max Size: 32 2025-10-10T00:33:57.4551374Z ******* 2025-10-10T00:33:57.4551809Z Agent 4 2025-10-10T00:33:57.4552271Z ******* 2025-10-10T00:33:57.4552769Z Name: gfx90a 
2025-10-10T00:33:57.4553422Z Uuid: GPU-915b6eb937f8a736 2025-10-10T00:33:57.4554373Z Marketing Name: 2025-10-10T00:33:57.4555095Z Vendor Name: AMD 2025-10-10T00:33:57.4555780Z Feature: KERNEL_DISPATCH 2025-10-10T00:33:57.4556469Z Profile: BASE_PROFILE 2025-10-10T00:33:57.4557161Z Float Round Mode: NEAR 2025-10-10T00:33:57.4557869Z Max Queue Number: 128(0x80) 2025-10-10T00:33:57.4558568Z Queue Min Size: 64(0x40) 2025-10-10T00:33:57.4559335Z Queue Max Size: 131072(0x20000) 2025-10-10T00:33:57.4560130Z Queue Type: MULTI 2025-10-10T00:33:57.4560886Z Node: 3 2025-10-10T00:33:57.4561652Z Device Type: GPU 2025-10-10T00:33:57.4562371Z Cache Info: 2025-10-10T00:33:57.4562985Z L1: 16(0x10) KB 2025-10-10T00:33:57.4563705Z L2: 8192(0x2000) KB 2025-10-10T00:33:57.4564419Z Chip ID: 29708(0x740c) 2025-10-10T00:33:57.4565094Z ASIC Revision: 1(0x1) 2025-10-10T00:33:57.4565792Z Cacheline Size: 128(0x80) 2025-10-10T00:33:57.4566492Z Max Clock Freq. (MHz): 1700 2025-10-10T00:33:57.4567153Z BDFID: 13568 2025-10-10T00:33:57.4567814Z Internal Node ID: 3 2025-10-10T00:33:57.4568499Z Compute Unit: 104 2025-10-10T00:33:57.4569172Z SIMDs per CU: 4 2025-10-10T00:33:57.4569953Z Shader Engines: 8 2025-10-10T00:33:57.4571182Z Shader Arrs. per Eng.: 1 2025-10-10T00:33:57.4572087Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:33:57.4572964Z Coherent Host Access: FALSE 2025-10-10T00:33:57.4573734Z Memory Properties: 2025-10-10T00:33:57.4574346Z Features: KERNEL_DISPATCH 2025-10-10T00:33:57.4575000Z Fast F16 Operation: TRUE 2025-10-10T00:33:57.4575713Z Wavefront Size: 64(0x40) 2025-10-10T00:33:57.4576748Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4577409Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4577960Z x 1024(0x400) 2025-10-10T00:33:57.4578536Z y 1024(0x400) 2025-10-10T00:33:57.4579103Z z 1024(0x400) 2025-10-10T00:33:57.4579753Z Max Waves Per CU: 32(0x20) 2025-10-10T00:33:57.4580456Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:33:57.4581151Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4581769Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4582279Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4582854Z y 65535(0xffff) 2025-10-10T00:33:57.4583443Z z 65535(0xffff) 2025-10-10T00:33:57.4584106Z Max fbarriers/Workgrp: 32 2025-10-10T00:33:57.4584866Z Packet Processor uCode:: 92 2025-10-10T00:33:57.4585602Z SDMA engine uCode:: 9 2025-10-10T00:33:57.4586313Z IOMMU Support:: None 2025-10-10T00:33:57.4586929Z Pool Info: 2025-10-10T00:33:57.4587412Z Pool 1 2025-10-10T00:33:57.4587998Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4588698Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4589371Z Allocatable: TRUE 2025-10-10T00:33:57.4590085Z Alloc Granule: 4KB 2025-10-10T00:33:57.4590826Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4591579Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4592309Z Accessible by all: FALSE 2025-10-10T00:33:57.4592940Z Pool 2 2025-10-10T00:33:57.4593523Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4594321Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4595002Z Allocatable: TRUE 2025-10-10T00:33:57.4595709Z Alloc Granule: 4KB 2025-10-10T00:33:57.4596439Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4597177Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4597924Z 
Accessible by all: FALSE 2025-10-10T00:33:57.4598546Z Pool 3 2025-10-10T00:33:57.4599168Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4599949Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4600726Z Allocatable: TRUE 2025-10-10T00:33:57.4601552Z Alloc Granule: 4KB 2025-10-10T00:33:57.4602414Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4603678Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4604429Z Accessible by all: FALSE 2025-10-10T00:33:57.4605050Z Pool 4 2025-10-10T00:33:57.4605595Z Segment: GROUP 2025-10-10T00:33:57.4606221Z Size: 64(0x40) KB 2025-10-10T00:33:57.4606868Z Allocatable: FALSE 2025-10-10T00:33:57.4607878Z Alloc Granule: 0KB 2025-10-10T00:33:57.4608601Z Alloc Recommended Granule:0KB 2025-10-10T00:33:57.4609330Z Alloc Alignment: 0KB 2025-10-10T00:33:57.4610039Z Accessible by all: FALSE 2025-10-10T00:33:57.4610662Z ISA Info: 2025-10-10T00:33:57.4611115Z ISA 1 2025-10-10T00:33:57.4611713Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:33:57.4612468Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:33:57.4613207Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:33:57.4613931Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4614670Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4615361Z Fast f16: TRUE 2025-10-10T00:33:57.4616059Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4616719Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4617314Z x 1024(0x400) 2025-10-10T00:33:57.4617926Z y 1024(0x400) 2025-10-10T00:33:57.4618512Z z 1024(0x400) 2025-10-10T00:33:57.4619170Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4619807Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4620351Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4620940Z y 65535(0xffff) 2025-10-10T00:33:57.4621510Z z 65535(0xffff) 2025-10-10T00:33:57.4622167Z FBarrier Max Size: 32 2025-10-10T00:33:57.4622801Z ******* 2025-10-10T00:33:57.4623252Z Agent 5 2025-10-10T00:33:57.4623690Z ******* 2025-10-10T00:33:57.4624196Z Name: gfx90a 
2025-10-10T00:33:57.4624856Z Uuid: GPU-2e1c5b2ef60aec01 2025-10-10T00:33:57.4625533Z Marketing Name: 2025-10-10T00:33:57.4626233Z Vendor Name: AMD 2025-10-10T00:33:57.4626917Z Feature: KERNEL_DISPATCH 2025-10-10T00:33:57.4627604Z Profile: BASE_PROFILE 2025-10-10T00:33:57.4628303Z Float Round Mode: NEAR 2025-10-10T00:33:57.4629035Z Max Queue Number: 128(0x80) 2025-10-10T00:33:57.4629856Z Queue Min Size: 64(0x40) 2025-10-10T00:33:57.4630673Z Queue Max Size: 131072(0x20000) 2025-10-10T00:33:57.4631475Z Queue Type: MULTI 2025-10-10T00:33:57.4632160Z Node: 4 2025-10-10T00:33:57.4632805Z Device Type: GPU 2025-10-10T00:33:57.4633413Z Cache Info: 2025-10-10T00:33:57.4633918Z L1: 16(0x10) KB 2025-10-10T00:33:57.4634979Z L2: 8192(0x2000) KB 2025-10-10T00:33:57.4635612Z Chip ID: 29708(0x740c) 2025-10-10T00:33:57.4636277Z ASIC Revision: 1(0x1) 2025-10-10T00:33:57.4636974Z Cacheline Size: 128(0x80) 2025-10-10T00:33:57.4637662Z Max Clock Freq. (MHz): 1700 2025-10-10T00:33:57.4638308Z BDFID: 4352 2025-10-10T00:33:57.4652122Z Internal Node ID: 4 2025-10-10T00:33:57.4652804Z Compute Unit: 104 2025-10-10T00:33:57.4653459Z SIMDs per CU: 4 2025-10-10T00:33:57.4654134Z Shader Engines: 8 2025-10-10T00:33:57.4654826Z Shader Arrs. per Eng.: 1 2025-10-10T00:33:57.4655550Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:33:57.4656286Z Coherent Host Access: FALSE 2025-10-10T00:33:57.4656935Z Memory Properties: 2025-10-10T00:33:57.4657449Z Features: KERNEL_DISPATCH 2025-10-10T00:33:57.4658097Z Fast F16 Operation: TRUE 2025-10-10T00:33:57.4658799Z Wavefront Size: 64(0x40) 2025-10-10T00:33:57.4659519Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4660172Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4660723Z x 1024(0x400) 2025-10-10T00:33:57.4661301Z y 1024(0x400) 2025-10-10T00:33:57.4661862Z z 1024(0x400) 2025-10-10T00:33:57.4662490Z Max Waves Per CU: 32(0x20) 2025-10-10T00:33:57.4663193Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:33:57.4663889Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4664507Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4665015Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4665588Z y 65535(0xffff) 2025-10-10T00:33:57.4666157Z z 65535(0xffff) 2025-10-10T00:33:57.4666825Z Max fbarriers/Workgrp: 32 2025-10-10T00:33:57.4667581Z Packet Processor uCode:: 92 2025-10-10T00:33:57.4668319Z SDMA engine uCode:: 9 2025-10-10T00:33:57.4669037Z IOMMU Support:: None 2025-10-10T00:33:57.4669652Z Pool Info: 2025-10-10T00:33:57.4670123Z Pool 1 2025-10-10T00:33:57.4670710Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4671413Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4672089Z Allocatable: TRUE 2025-10-10T00:33:57.4672793Z Alloc Granule: 4KB 2025-10-10T00:33:57.4673517Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4674549Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4675318Z Accessible by all: FALSE 2025-10-10T00:33:57.4675959Z Pool 2 2025-10-10T00:33:57.4676531Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4677215Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4677887Z Allocatable: TRUE 2025-10-10T00:33:57.4678928Z Alloc Granule: 4KB 2025-10-10T00:33:57.4679816Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4680696Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4681545Z 
Accessible by all: FALSE 2025-10-10T00:33:57.4682279Z Pool 3 2025-10-10T00:33:57.4682945Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4683983Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4684639Z Allocatable: TRUE 2025-10-10T00:33:57.4685241Z Alloc Granule: 4KB 2025-10-10T00:33:57.4685877Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4686517Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4687151Z Accessible by all: FALSE 2025-10-10T00:33:57.4687697Z Pool 4 2025-10-10T00:33:57.4688166Z Segment: GROUP 2025-10-10T00:33:57.4688767Z Size: 64(0x40) KB 2025-10-10T00:33:57.4689343Z Allocatable: FALSE 2025-10-10T00:33:57.4689990Z Alloc Granule: 0KB 2025-10-10T00:33:57.4690758Z Alloc Recommended Granule:0KB 2025-10-10T00:33:57.4691532Z Alloc Alignment: 0KB 2025-10-10T00:33:57.4692273Z Accessible by all: FALSE 2025-10-10T00:33:57.4692840Z ISA Info: 2025-10-10T00:33:57.4693242Z ISA 1 2025-10-10T00:33:57.4693757Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:33:57.4694440Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:33:57.4695079Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:33:57.4695708Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4696353Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4696949Z Fast f16: TRUE 2025-10-10T00:33:57.4697559Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4698149Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4698650Z x 1024(0x400) 2025-10-10T00:33:57.4699171Z y 1024(0x400) 2025-10-10T00:33:57.4699684Z z 1024(0x400) 2025-10-10T00:33:57.4700261Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4700822Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4701296Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4701811Z y 65535(0xffff) 2025-10-10T00:33:57.4702318Z z 65535(0xffff) 2025-10-10T00:33:57.4702931Z FBarrier Max Size: 32 2025-10-10T00:33:57.4703472Z ******* 2025-10-10T00:33:57.4703868Z Agent 6 2025-10-10T00:33:57.4704252Z ******* 2025-10-10T00:33:57.4704690Z Name: gfx90a 
2025-10-10T00:33:57.4705259Z Uuid: GPU-885706dc5002792b 2025-10-10T00:33:57.4705854Z Marketing Name: 2025-10-10T00:33:57.4706456Z Vendor Name: AMD 2025-10-10T00:33:57.4707293Z Feature: KERNEL_DISPATCH 2025-10-10T00:33:57.4707905Z Profile: BASE_PROFILE 2025-10-10T00:33:57.4708510Z Float Round Mode: NEAR 2025-10-10T00:33:57.4709121Z Max Queue Number: 128(0x80) 2025-10-10T00:33:57.4709783Z Queue Min Size: 64(0x40) 2025-10-10T00:33:57.4710490Z Queue Max Size: 131072(0x20000) 2025-10-10T00:33:57.4711451Z Queue Type: MULTI 2025-10-10T00:33:57.4712103Z Node: 5 2025-10-10T00:33:57.4712763Z Device Type: GPU 2025-10-10T00:33:57.4713302Z Cache Info: 2025-10-10T00:33:57.4713746Z L1: 16(0x10) KB 2025-10-10T00:33:57.4714360Z L2: 8192(0x2000) KB 2025-10-10T00:33:57.4714891Z Chip ID: 29708(0x740c) 2025-10-10T00:33:57.4715496Z ASIC Revision: 1(0x1) 2025-10-10T00:33:57.4716211Z Cacheline Size: 128(0x80) 2025-10-10T00:33:57.4716934Z Max Clock Freq. (MHz): 1700 2025-10-10T00:33:57.4717613Z BDFID: 5120 2025-10-10T00:33:57.4718244Z Internal Node ID: 5 2025-10-10T00:33:57.4718842Z Compute Unit: 104 2025-10-10T00:33:57.4719429Z SIMDs per CU: 4 2025-10-10T00:33:57.4720018Z Shader Engines: 8 2025-10-10T00:33:57.4720631Z Shader Arrs. per Eng.: 1 2025-10-10T00:33:57.4721274Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:33:57.4721918Z Coherent Host Access: FALSE 2025-10-10T00:33:57.4722486Z Memory Properties: 2025-10-10T00:33:57.4742838Z Features: KERNEL_DISPATCH 2025-10-10T00:33:57.4743572Z Fast F16 Operation: TRUE 2025-10-10T00:33:57.4744261Z Wavefront Size: 64(0x40) 2025-10-10T00:33:57.4744925Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4745545Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4746077Z x 1024(0x400) 2025-10-10T00:33:57.4746624Z y 1024(0x400) 2025-10-10T00:33:57.4747135Z z 1024(0x400) 2025-10-10T00:33:57.4747731Z Max Waves Per CU: 32(0x20) 2025-10-10T00:33:57.4748405Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:33:57.4749043Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4749619Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4750101Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4750628Z y 65535(0xffff) 2025-10-10T00:33:57.4751139Z z 65535(0xffff) 2025-10-10T00:33:57.4751721Z Max fbarriers/Workgrp: 32 2025-10-10T00:33:57.4752420Z Packet Processor uCode:: 92 2025-10-10T00:33:57.4753064Z SDMA engine uCode:: 9 2025-10-10T00:33:57.4753680Z IOMMU Support:: None 2025-10-10T00:33:57.4754535Z Pool Info: 2025-10-10T00:33:57.4754981Z Pool 1 2025-10-10T00:33:57.4755541Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4756559Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4757192Z Allocatable: TRUE 2025-10-10T00:33:57.4757821Z Alloc Granule: 4KB 2025-10-10T00:33:57.4758477Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4759144Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4760187Z Accessible by all: FALSE 2025-10-10T00:33:57.4760842Z Pool 2 2025-10-10T00:33:57.4761452Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4762209Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4762952Z Allocatable: TRUE 2025-10-10T00:33:57.4763693Z Alloc Granule: 4KB 2025-10-10T00:33:57.4764495Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4765285Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4766043Z 
Accessible by all: FALSE 2025-10-10T00:33:57.4766601Z Pool 3 2025-10-10T00:33:57.4767104Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4767698Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4768313Z Allocatable: TRUE 2025-10-10T00:33:57.4768941Z Alloc Granule: 4KB 2025-10-10T00:33:57.4769590Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4770249Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4770891Z Accessible by all: FALSE 2025-10-10T00:33:57.4771458Z Pool 4 2025-10-10T00:33:57.4771943Z Segment: GROUP 2025-10-10T00:33:57.4772525Z Size: 64(0x40) KB 2025-10-10T00:33:57.4773122Z Allocatable: FALSE 2025-10-10T00:33:57.4773749Z Alloc Granule: 0KB 2025-10-10T00:33:57.4774399Z Alloc Recommended Granule:0KB 2025-10-10T00:33:57.4775064Z Alloc Alignment: 0KB 2025-10-10T00:33:57.4775701Z Accessible by all: FALSE 2025-10-10T00:33:57.4776263Z ISA Info: 2025-10-10T00:33:57.4776679Z ISA 1 2025-10-10T00:33:57.4777209Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:33:57.4777904Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:33:57.4778577Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:33:57.4779232Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4779984Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4780701Z Fast f16: TRUE 2025-10-10T00:33:57.4781426Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4782132Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4782755Z x 1024(0x400) 2025-10-10T00:33:57.4783307Z y 1024(0x400) 2025-10-10T00:33:57.4783821Z z 1024(0x400) 2025-10-10T00:33:57.4784385Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4784965Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4785699Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4786229Z y 65535(0xffff) 2025-10-10T00:33:57.4786740Z z 65535(0xffff) 2025-10-10T00:33:57.4787324Z FBarrier Max Size: 32 2025-10-10T00:33:57.4787875Z ******* 2025-10-10T00:33:57.4788271Z Agent 7 2025-10-10T00:33:57.4788906Z ******* 2025-10-10T00:33:57.4789352Z Name: gfx90a 
2025-10-10T00:33:57.4789938Z Uuid: GPU-052333bdda4adfee 2025-10-10T00:33:57.4790547Z Marketing Name: 2025-10-10T00:33:57.4791159Z Vendor Name: AMD 2025-10-10T00:33:57.4791769Z Feature: KERNEL_DISPATCH 2025-10-10T00:33:57.4792388Z Profile: BASE_PROFILE 2025-10-10T00:33:57.4793000Z Float Round Mode: NEAR 2025-10-10T00:33:57.4793624Z Max Queue Number: 128(0x80) 2025-10-10T00:33:57.4794342Z Queue Min Size: 64(0x40) 2025-10-10T00:33:57.4794942Z Queue Max Size: 131072(0x20000) 2025-10-10T00:33:57.4795535Z Queue Type: MULTI 2025-10-10T00:33:57.4796116Z Node: 6 2025-10-10T00:33:57.4796677Z Device Type: GPU 2025-10-10T00:33:57.4797229Z Cache Info: 2025-10-10T00:33:57.4797692Z L1: 16(0x10) KB 2025-10-10T00:33:57.4798233Z L2: 8192(0x2000) KB 2025-10-10T00:33:57.4798766Z Chip ID: 29708(0x740c) 2025-10-10T00:33:57.4799348Z ASIC Revision: 1(0x1) 2025-10-10T00:33:57.4800105Z Cacheline Size: 128(0x80) 2025-10-10T00:33:57.4800829Z Max Clock Freq. (MHz): 1700 2025-10-10T00:33:57.4801506Z BDFID: 44544 2025-10-10T00:33:57.4802205Z Internal Node ID: 6 2025-10-10T00:33:57.4802935Z Compute Unit: 104 2025-10-10T00:33:57.4803638Z SIMDs per CU: 4 2025-10-10T00:33:57.4804349Z Shader Engines: 8 2025-10-10T00:33:57.4805109Z Shader Arrs. per Eng.: 1 2025-10-10T00:33:57.4805865Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:33:57.4806548Z Coherent Host Access: FALSE 2025-10-10T00:33:57.4807129Z Memory Properties: 2025-10-10T00:33:57.4807617Z Features: KERNEL_DISPATCH 2025-10-10T00:33:57.4808215Z Fast F16 Operation: TRUE 2025-10-10T00:33:57.4808873Z Wavefront Size: 64(0x40) 2025-10-10T00:33:57.4809518Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4810107Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4810629Z x 1024(0x400) 2025-10-10T00:33:57.4811152Z y 1024(0x400) 2025-10-10T00:33:57.4811713Z z 1024(0x400) 2025-10-10T00:33:57.4812384Z Max Waves Per CU: 32(0x20) 2025-10-10T00:33:57.4813129Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:33:57.4814234Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4814897Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4815367Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4815892Z y 65535(0xffff) 2025-10-10T00:33:57.4816399Z z 65535(0xffff) 2025-10-10T00:33:57.4816982Z Max fbarriers/Workgrp: 32 2025-10-10T00:33:57.4817668Z Packet Processor uCode:: 92 2025-10-10T00:33:57.4818600Z SDMA engine uCode:: 9 2025-10-10T00:33:57.4819236Z IOMMU Support:: None 2025-10-10T00:33:57.4819790Z Pool Info: 2025-10-10T00:33:57.4820212Z Pool 1 2025-10-10T00:33:57.4820740Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4821366Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4821985Z Allocatable: TRUE 2025-10-10T00:33:57.4822619Z Alloc Granule: 4KB 2025-10-10T00:33:57.4823288Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4823963Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4824616Z Accessible by all: FALSE 2025-10-10T00:33:57.4825188Z Pool 2 2025-10-10T00:33:57.4825707Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4826322Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4826926Z Allocatable: TRUE 2025-10-10T00:33:57.4827551Z Alloc Granule: 4KB 2025-10-10T00:33:57.4828219Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4828883Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4829522Z 
Accessible by all: FALSE 2025-10-10T00:33:57.4830079Z Pool 3 2025-10-10T00:33:57.4830576Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4831174Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4831757Z Allocatable: TRUE 2025-10-10T00:33:57.4832395Z Alloc Granule: 4KB 2025-10-10T00:33:57.4833048Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4833705Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4834427Z Accessible by all: FALSE 2025-10-10T00:33:57.4834980Z Pool 4 2025-10-10T00:33:57.4835466Z Segment: GROUP 2025-10-10T00:33:57.4836040Z Size: 64(0x40) KB 2025-10-10T00:33:57.4836620Z Allocatable: FALSE 2025-10-10T00:33:57.4837238Z Alloc Granule: 0KB 2025-10-10T00:33:57.4837884Z Alloc Recommended Granule:0KB 2025-10-10T00:33:57.4838536Z Alloc Alignment: 0KB 2025-10-10T00:33:57.4839186Z Accessible by all: FALSE 2025-10-10T00:33:57.4839791Z ISA Info: 2025-10-10T00:33:57.4840272Z ISA 1 2025-10-10T00:33:57.4840887Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:33:57.4841682Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:33:57.4842860Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:33:57.4843639Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4844422Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4845159Z Fast f16: TRUE 2025-10-10T00:33:57.4845884Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4846524Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4847326Z x 1024(0x400) 2025-10-10T00:33:57.4847864Z y 1024(0x400) 2025-10-10T00:33:57.4848369Z z 1024(0x400) 2025-10-10T00:33:57.4848941Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4849511Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4849987Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4850502Z y 65535(0xffff) 2025-10-10T00:33:57.4851022Z z 65535(0xffff) 2025-10-10T00:33:57.4851635Z FBarrier Max Size: 32 2025-10-10T00:33:57.4852282Z ******* 2025-10-10T00:33:57.4852733Z Agent 8 2025-10-10T00:33:57.4853190Z ******* 2025-10-10T00:33:57.4853718Z Name: gfx90a 
2025-10-10T00:33:57.4854388Z Uuid: GPU-648b8d31dd305074 2025-10-10T00:33:57.4855047Z Marketing Name: 2025-10-10T00:33:57.4855654Z Vendor Name: AMD 2025-10-10T00:33:57.4856249Z Feature: KERNEL_DISPATCH 2025-10-10T00:33:57.4856862Z Profile: BASE_PROFILE 2025-10-10T00:33:57.4857477Z Float Round Mode: NEAR 2025-10-10T00:33:57.4858098Z Max Queue Number: 128(0x80) 2025-10-10T00:33:57.4858701Z Queue Min Size: 64(0x40) 2025-10-10T00:33:57.4859290Z Queue Max Size: 131072(0x20000) 2025-10-10T00:33:57.4859881Z Queue Type: MULTI 2025-10-10T00:33:57.4860440Z Node: 7 2025-10-10T00:33:57.4861007Z Device Type: GPU 2025-10-10T00:33:57.4861545Z Cache Info: 2025-10-10T00:33:57.4862024Z L1: 16(0x10) KB 2025-10-10T00:33:57.4862567Z L2: 8192(0x2000) KB 2025-10-10T00:33:57.4863101Z Chip ID: 29708(0x740c) 2025-10-10T00:33:57.4863694Z ASIC Revision: 1(0x1) 2025-10-10T00:33:57.4864298Z Cacheline Size: 128(0x80) 2025-10-10T00:33:57.4864909Z Max Clock Freq. (MHz): 1700 2025-10-10T00:33:57.4865476Z BDFID: 45824 2025-10-10T00:33:57.4866049Z Internal Node ID: 7 2025-10-10T00:33:57.4866652Z Compute Unit: 104 2025-10-10T00:33:57.4867246Z SIMDs per CU: 4 2025-10-10T00:33:57.4867846Z Shader Engines: 8 2025-10-10T00:33:57.4868467Z Shader Arrs. per Eng.: 1 2025-10-10T00:33:57.4869094Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:33:57.4869723Z Coherent Host Access: FALSE 2025-10-10T00:33:57.4870530Z Memory Properties: 2025-10-10T00:33:57.4870999Z Features: KERNEL_DISPATCH 2025-10-10T00:33:57.4871571Z Fast F16 Operation: TRUE 2025-10-10T00:33:57.4872195Z Wavefront Size: 64(0x40) 2025-10-10T00:33:57.4872810Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4873376Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4873848Z x 1024(0x400) 2025-10-10T00:33:57.4874841Z y 1024(0x400) 2025-10-10T00:33:57.4875345Z z 1024(0x400) 2025-10-10T00:33:57.4875926Z Max Waves Per CU: 32(0x20) 2025-10-10T00:33:57.4876553Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:33:57.4877166Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4877730Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4878190Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4878700Z y 65535(0xffff) 2025-10-10T00:33:57.4879194Z z 65535(0xffff) 2025-10-10T00:33:57.4879797Z Max fbarriers/Workgrp: 32 2025-10-10T00:33:57.4880584Z Packet Processor uCode:: 92 2025-10-10T00:33:57.4881349Z SDMA engine uCode:: 9 2025-10-10T00:33:57.4882087Z IOMMU Support:: None 2025-10-10T00:33:57.4882720Z Pool Info: 2025-10-10T00:33:57.4883205Z Pool 1 2025-10-10T00:33:57.4883802Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4884516Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4885230Z Allocatable: TRUE 2025-10-10T00:33:57.4885964Z Alloc Granule: 4KB 2025-10-10T00:33:57.4886657Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4887312Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4887939Z Accessible by all: FALSE 2025-10-10T00:33:57.4888482Z Pool 2 2025-10-10T00:33:57.4888995Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4889586Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4890163Z Allocatable: TRUE 2025-10-10T00:33:57.4890780Z Alloc Granule: 4KB 2025-10-10T00:33:57.4891425Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4892193Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4892929Z 
Accessible by all: FALSE 2025-10-10T00:33:57.4893571Z Pool 3 2025-10-10T00:33:57.4894149Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4894849Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4895442Z Allocatable: TRUE 2025-10-10T00:33:57.4896065Z Alloc Granule: 4KB 2025-10-10T00:33:57.4896705Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4897347Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4897982Z Accessible by all: FALSE 2025-10-10T00:33:57.4898527Z Pool 4 2025-10-10T00:33:57.4898996Z Segment: GROUP 2025-10-10T00:33:57.4899864Z Size: 64(0x40) KB 2025-10-10T00:33:57.4900460Z Allocatable: FALSE 2025-10-10T00:33:57.4901071Z Alloc Granule: 0KB 2025-10-10T00:33:57.4901709Z Alloc Recommended Granule:0KB 2025-10-10T00:33:57.4902359Z Alloc Alignment: 0KB 2025-10-10T00:33:57.4903266Z Accessible by all: FALSE 2025-10-10T00:33:57.4903814Z ISA Info: 2025-10-10T00:33:57.4904208Z ISA 1 2025-10-10T00:33:57.4904718Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:33:57.4905396Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:33:57.4906042Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:33:57.4906686Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4907336Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4907939Z Fast f16: TRUE 2025-10-10T00:33:57.4908544Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4909129Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4909636Z x 1024(0x400) 2025-10-10T00:33:57.4910165Z y 1024(0x400) 2025-10-10T00:33:57.4910664Z z 1024(0x400) 2025-10-10T00:33:57.4911212Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4911772Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4912254Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4912779Z y 65535(0xffff) 2025-10-10T00:33:57.4913286Z z 65535(0xffff) 2025-10-10T00:33:57.4913853Z FBarrier Max Size: 32 2025-10-10T00:33:57.4914497Z ******* 2025-10-10T00:33:57.4914888Z Agent 9 2025-10-10T00:33:57.4915268Z ******* 2025-10-10T00:33:57.4915720Z Name: gfx90a 
2025-10-10T00:33:57.4916297Z Uuid: GPU-065f01543a0c255e 2025-10-10T00:33:57.4916888Z Marketing Name: 2025-10-10T00:33:57.4917493Z Vendor Name: AMD 2025-10-10T00:33:57.4918092Z Feature: KERNEL_DISPATCH 2025-10-10T00:33:57.4918692Z Profile: BASE_PROFILE 2025-10-10T00:33:57.4919301Z Float Round Mode: NEAR 2025-10-10T00:33:57.4919960Z Max Queue Number: 128(0x80) 2025-10-10T00:33:57.4920664Z Queue Min Size: 64(0x40) 2025-10-10T00:33:57.4921359Z Queue Max Size: 131072(0x20000) 2025-10-10T00:33:57.4922052Z Queue Type: MULTI 2025-10-10T00:33:57.4922725Z Node: 8 2025-10-10T00:33:57.4923400Z Device Type: GPU 2025-10-10T00:33:57.4924035Z Cache Info: 2025-10-10T00:33:57.4924557Z L1: 16(0x10) KB 2025-10-10T00:33:57.4925173Z L2: 8192(0x2000) KB 2025-10-10T00:33:57.4925806Z Chip ID: 29708(0x740c) 2025-10-10T00:33:57.4926450Z ASIC Revision: 1(0x1) 2025-10-10T00:33:57.4927354Z Cacheline Size: 128(0x80) 2025-10-10T00:33:57.4927991Z Max Clock Freq. (MHz): 1700 2025-10-10T00:33:57.4928568Z BDFID: 36352 2025-10-10T00:33:57.4929151Z Internal Node ID: 8 2025-10-10T00:33:57.4929749Z Compute Unit: 104 2025-10-10T00:33:57.4930326Z SIMDs per CU: 4 2025-10-10T00:33:57.4931225Z Shader Engines: 8 2025-10-10T00:33:57.4931895Z Shader Arrs. per Eng.: 1 2025-10-10T00:33:57.4932640Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:33:57.4933380Z Coherent Host Access: FALSE 2025-10-10T00:33:57.4934045Z Memory Properties: 2025-10-10T00:33:57.4934590Z Features: KERNEL_DISPATCH 2025-10-10T00:33:57.4935218Z Fast F16 Operation: TRUE 2025-10-10T00:33:57.4935827Z Wavefront Size: 64(0x40) 2025-10-10T00:33:57.4936441Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4937001Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4937475Z x 1024(0x400) 2025-10-10T00:33:57.4937968Z y 1024(0x400) 2025-10-10T00:33:57.4938464Z z 1024(0x400) 2025-10-10T00:33:57.4939025Z Max Waves Per CU: 32(0x20) 2025-10-10T00:33:57.4939637Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:33:57.4940236Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4940781Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4941228Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4941722Z y 65535(0xffff) 2025-10-10T00:33:57.4942216Z z 65535(0xffff) 2025-10-10T00:33:57.4942794Z Max fbarriers/Workgrp: 32 2025-10-10T00:33:57.4943453Z Packet Processor uCode:: 92 2025-10-10T00:33:57.4944086Z SDMA engine uCode:: 9 2025-10-10T00:33:57.4944707Z IOMMU Support:: None 2025-10-10T00:33:57.4945246Z Pool Info: 2025-10-10T00:33:57.4945639Z Pool 1 2025-10-10T00:33:57.4946149Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4946763Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4947370Z Allocatable: TRUE 2025-10-10T00:33:57.4947994Z Alloc Granule: 4KB 2025-10-10T00:33:57.4948647Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4949312Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4949946Z Accessible by all: FALSE 2025-10-10T00:33:57.4950500Z Pool 2 2025-10-10T00:33:57.4951000Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4951613Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4952186Z Allocatable: TRUE 2025-10-10T00:33:57.4952801Z Alloc Granule: 4KB 2025-10-10T00:33:57.4953437Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4954325Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4955356Z 
Accessible by all: FALSE 2025-10-10T00:33:57.4955968Z Pool 3 2025-10-10T00:33:57.4956459Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4957034Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4957612Z Allocatable: TRUE 2025-10-10T00:33:57.4958219Z Alloc Granule: 4KB 2025-10-10T00:33:57.4959142Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4959799Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4960531Z Accessible by all: FALSE 2025-10-10T00:33:57.4961165Z Pool 4 2025-10-10T00:33:57.4961716Z Segment: GROUP 2025-10-10T00:33:57.4962384Z Size: 64(0x40) KB 2025-10-10T00:33:57.4963046Z Allocatable: FALSE 2025-10-10T00:33:57.4963757Z Alloc Granule: 0KB 2025-10-10T00:33:57.4964503Z Alloc Recommended Granule:0KB 2025-10-10T00:33:57.4965265Z Alloc Alignment: 0KB 2025-10-10T00:33:57.4965785Z Accessible by all: FALSE 2025-10-10T00:33:57.4966093Z ISA Info: 2025-10-10T00:33:57.4966295Z ISA 1 2025-10-10T00:33:57.4966534Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:33:57.4966845Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:33:57.4967141Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:33:57.4967432Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4967738Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4968018Z Fast f16: TRUE 2025-10-10T00:33:57.4968290Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4968561Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4968796Z x 1024(0x400) 2025-10-10T00:33:57.4969035Z y 1024(0x400) 2025-10-10T00:33:57.4969269Z z 1024(0x400) 2025-10-10T00:33:57.4969525Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4969784Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4969999Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4970234Z y 65535(0xffff) 2025-10-10T00:33:57.4970474Z z 65535(0xffff) 2025-10-10T00:33:57.4970735Z FBarrier Max Size: 32 2025-10-10T00:33:57.4970979Z ******* 2025-10-10T00:33:57.4971159Z Agent 10 2025-10-10T00:33:57.4971336Z ******* 2025-10-10T00:33:57.4971537Z Name: gfx90a 
2025-10-10T00:33:57.4971794Z Uuid: GPU-6d0b1913df2b2636 2025-10-10T00:33:57.4972074Z Marketing Name: 2025-10-10T00:33:57.4972350Z Vendor Name: AMD 2025-10-10T00:33:57.4972619Z Feature: KERNEL_DISPATCH 2025-10-10T00:33:57.4972894Z Profile: BASE_PROFILE 2025-10-10T00:33:57.4973168Z Float Round Mode: NEAR 2025-10-10T00:33:57.4973571Z Max Queue Number: 128(0x80) 2025-10-10T00:33:57.4973855Z Queue Min Size: 64(0x40) 2025-10-10T00:33:57.4974128Z Queue Max Size: 131072(0x20000) 2025-10-10T00:33:57.4974400Z Queue Type: MULTI 2025-10-10T00:33:57.4974653Z Node: 9 2025-10-10T00:33:57.4974904Z Device Type: GPU 2025-10-10T00:33:57.4975258Z Cache Info: 2025-10-10T00:33:57.4975459Z L1: 16(0x10) KB 2025-10-10T00:33:57.4975696Z L2: 8192(0x2000) KB 2025-10-10T00:33:57.4975944Z Chip ID: 29708(0x740c) 2025-10-10T00:33:57.4976204Z ASIC Revision: 1(0x1) 2025-10-10T00:33:57.4976481Z Cacheline Size: 128(0x80) 2025-10-10T00:33:57.4976767Z Max Clock Freq. (MHz): 1700 2025-10-10T00:33:57.4977033Z BDFID: 37632 2025-10-10T00:33:57.4977296Z Internal Node ID: 9 2025-10-10T00:33:57.4977571Z Compute Unit: 104 2025-10-10T00:33:57.4977839Z SIMDs per CU: 4 2025-10-10T00:33:57.4978111Z Shader Engines: 8 2025-10-10T00:33:57.4978406Z Shader Arrs. per Eng.: 1 2025-10-10T00:33:57.4978694Z WatchPts on Addr. 
Ranges:4 2025-10-10T00:33:57.4978987Z Coherent Host Access: FALSE 2025-10-10T00:33:57.4979244Z Memory Properties: 2025-10-10T00:33:57.4979454Z Features: KERNEL_DISPATCH 2025-10-10T00:33:57.4979720Z Fast F16 Operation: TRUE 2025-10-10T00:33:57.4980001Z Wavefront Size: 64(0x40) 2025-10-10T00:33:57.4980283Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4980542Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4980761Z x 1024(0x400) 2025-10-10T00:33:57.4980992Z y 1024(0x400) 2025-10-10T00:33:57.4981223Z z 1024(0x400) 2025-10-10T00:33:57.4981480Z Max Waves Per CU: 32(0x20) 2025-10-10T00:33:57.4981763Z Max Work-item Per CU: 2048(0x800) 2025-10-10T00:33:57.4982045Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4982302Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4982502Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4982740Z y 65535(0xffff) 2025-10-10T00:33:57.4982972Z z 65535(0xffff) 2025-10-10T00:33:57.4983239Z Max fbarriers/Workgrp: 32 2025-10-10T00:33:57.4983542Z Packet Processor uCode:: 92 2025-10-10T00:33:57.4983838Z SDMA engine uCode:: 9 2025-10-10T00:33:57.4984129Z IOMMU Support:: None 2025-10-10T00:33:57.4984376Z Pool Info: 2025-10-10T00:33:57.4984558Z Pool 1 2025-10-10T00:33:57.4984789Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-10-10T00:33:57.4985068Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4985339Z Allocatable: TRUE 2025-10-10T00:33:57.4985624Z Alloc Granule: 4KB 2025-10-10T00:33:57.4986034Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4986344Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4986634Z Accessible by all: FALSE 2025-10-10T00:33:57.4986883Z Pool 2 2025-10-10T00:33:57.4987110Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-10-10T00:33:57.4987389Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4987766Z Allocatable: TRUE 2025-10-10T00:33:57.4988045Z Alloc Granule: 4KB 2025-10-10T00:33:57.4988342Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4988641Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4988929Z 
Accessible by all: FALSE 2025-10-10T00:33:57.4989180Z Pool 3 2025-10-10T00:33:57.4989407Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-10-10T00:33:57.4989675Z Size: 67092480(0x3ffc000) KB 2025-10-10T00:33:57.4989941Z Allocatable: TRUE 2025-10-10T00:33:57.4990220Z Alloc Granule: 4KB 2025-10-10T00:33:57.4990509Z Alloc Recommended Granule:2048KB 2025-10-10T00:33:57.4990810Z Alloc Alignment: 4KB 2025-10-10T00:33:57.4991098Z Accessible by all: FALSE 2025-10-10T00:33:57.4991347Z Pool 4 2025-10-10T00:33:57.4991561Z Segment: GROUP 2025-10-10T00:33:57.4991815Z Size: 64(0x40) KB 2025-10-10T00:33:57.4992083Z Allocatable: FALSE 2025-10-10T00:33:57.4992363Z Alloc Granule: 0KB 2025-10-10T00:33:57.4992655Z Alloc Recommended Granule:0KB 2025-10-10T00:33:57.4992945Z Alloc Alignment: 0KB 2025-10-10T00:33:57.4993226Z Accessible by all: FALSE 2025-10-10T00:33:57.4993473Z ISA Info: 2025-10-10T00:33:57.4993658Z ISA 1 2025-10-10T00:33:57.4993890Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-10-10T00:33:57.4994234Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-10-10T00:33:57.4994521Z Profiles: HSA_PROFILE_BASE 2025-10-10T00:33:57.4994809Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4995115Z Default Rounding Mode: NEAR 2025-10-10T00:33:57.4995393Z Fast f16: TRUE 2025-10-10T00:33:57.4995671Z Workgroup Max Size: 1024(0x400) 2025-10-10T00:33:57.4995936Z Workgroup Max Size per Dimension: 2025-10-10T00:33:57.4996173Z x 1024(0x400) 2025-10-10T00:33:57.4996406Z y 1024(0x400) 2025-10-10T00:33:57.4996640Z z 1024(0x400) 2025-10-10T00:33:57.4996905Z Grid Max Size: 4294967295(0xffffffff) 2025-10-10T00:33:57.4997162Z Grid Max Size per Dimension: 2025-10-10T00:33:57.4997375Z x 2147483647(0x7fffffff) 2025-10-10T00:33:57.4997610Z y 65535(0xffff) 2025-10-10T00:33:57.4997838Z z 65535(0xffff) 2025-10-10T00:33:57.4998242Z FBarrier Max Size: 32 2025-10-10T00:33:57.4998499Z *** Done *** 2025-10-10T00:33:57.4998681Z + rocminfo 2025-10-10T00:33:57.4998848Z + grep -E 'Name:.*\sgfx|Marketing' 
2025-10-10T00:33:57.6810598Z Marketing Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:33:57.6812360Z Marketing Name: AMD EPYC 7713 64-Core Processor 2025-10-10T00:33:57.6813302Z Name: gfx90a 2025-10-10T00:33:57.6814720Z Marketing Name: 2025-10-10T00:33:57.6815411Z Name: gfx90a 2025-10-10T00:33:57.6816098Z Marketing Name: 2025-10-10T00:33:57.6816756Z Name: gfx90a 2025-10-10T00:33:57.6817411Z Marketing Name: 2025-10-10T00:33:57.6818078Z Name: gfx90a 2025-10-10T00:33:57.6818728Z Marketing Name: 2025-10-10T00:33:57.6819371Z Name: gfx90a 2025-10-10T00:33:57.6820012Z Marketing Name: 2025-10-10T00:33:57.6820662Z Name: gfx90a 2025-10-10T00:33:57.6821307Z Marketing Name: 2025-10-10T00:33:57.6821967Z Name: gfx90a 2025-10-10T00:33:57.6822616Z Marketing Name: 2025-10-10T00:33:57.6823263Z Name: gfx90a 2025-10-10T00:33:57.6824007Z Marketing Name: 2025-10-10T00:33:57.6983365Z + MAYBE_ROCM=rocm/ 2025-10-10T00:33:57.6983979Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-10-10T00:33:57.6984720Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-10-10T00:33:57.6985349Z + pip_install ninja==1.10.2 2025-10-10T00:33:57.6986043Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-10-10T00:33:57.6987015Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-10-10T00:33:58.2067665Z Collecting ninja==1.10.2 2025-10-10T00:33:58.2578857Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-10-10T00:33:58.2777547Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-10-10T00:33:58.6138945Z Installing collected packages: ninja 2025-10-10T00:33:58.6139603Z Attempting uninstall: ninja 2025-10-10T00:33:58.6145349Z Found existing installation: ninja 1.11.1.4 2025-10-10T00:33:58.6165047Z Uninstalling ninja-1.11.1.4: 2025-10-10T00:33:58.6248445Z Successfully uninstalled ninja-1.11.1.4 2025-10-10T00:33:58.6538397Z Successfully installed ninja-1.10.2 2025-10-10T00:33:58.7121006Z + 
export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-10-10T00:33:58.7124739Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-10-10T00:33:58.7126754Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-10-10T00:33:58.7127361Z + [[ linux-jammy-rocm-py3.10 == *asan* ]] 2025-10-10T00:33:58.7127983Z + [[ linux-jammy-rocm-py3.10 == *-debug* ]] 2025-10-10T00:33:58.7128602Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-10-10T00:33:58.7129449Z + echo 'We are not in debug mode: linux-jammy-rocm-py3.10. Expect the assertion to pass' 2025-10-10T00:33:58.7131221Z We are not in debug mode: linux-jammy-rocm-py3.10. 
Expect the assertion to pass 2025-10-10T00:33:58.7132012Z + cd test 2025-10-10T00:33:58.7132617Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-10-10T00:34:00.2307206Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-10-10T00:34:00.2307933Z + [[ default == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-10-10T00:34:00.2308597Z + [[ default == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-10-10T00:34:00.2317019Z + DYNAMO_BENCHMARK_FLAGS=() 2025-10-10T00:34:00.2317663Z + [[ default == *pr_time_benchmarks* ]] 2025-10-10T00:34:00.2319329Z + [[ default == *dynamo_eager* ]] 2025-10-10T00:34:00.2319945Z + [[ default == *aot_eager* ]] 2025-10-10T00:34:00.2320503Z + [[ default == *aot_inductor* ]] 2025-10-10T00:34:00.2321037Z + [[ default == *max_autotune_inductor* ]] 2025-10-10T00:34:00.2321587Z + [[ default == *inductor* ]] 2025-10-10T00:34:00.2322063Z + [[ default == *dynamic* ]] 2025-10-10T00:34:00.2322527Z + [[ default == *cpu* ]] 2025-10-10T00:34:00.2323037Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-10-10T00:34:00.2360446Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-10-10T00:34:00.2361177Z + [[ linux-jammy-rocm-py3.10 == *-bazel-* ]] 2025-10-10T00:34:00.2368811Z + cd test 2025-10-10T00:34:00.2370748Z + python -c 'import torch; print(torch.__config__.show())' 2025-10-10T00:34:01.6809399Z PyTorch built with: 2025-10-10T00:34:01.6809926Z - GCC 11.4 2025-10-10T00:34:01.6810391Z - C++ Version: 201703 2025-10-10T00:34:01.6811462Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-10-10T00:34:01.6812786Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-10-10T00:34:01.6813582Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-10-10T00:34:01.6814193Z - LAPACK is enabled (usually provided by MKL) 2025-10-10T00:34:01.6814783Z - NNPACK is enabled 2025-10-10T00:34:01.6815255Z - CPU capability usage: AVX2 2025-10-10T00:34:01.6815755Z - HIP Runtime 7.0.51831 2025-10-10T00:34:01.6816220Z - MIOpen 3.5.0 2025-10-10T00:34:01.6816624Z - Magma 2.9.0 2025-10-10T00:34:01.6824347Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_FBGEMM_GENAI -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.10.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-10-10T00:34:01.6832246Z 2025-10-10T00:34:02.0044171Z + cd test 2025-10-10T00:34:02.0044931Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-10-10T00:34:03.2225702Z ATen/Parallel: 2025-10-10T00:34:03.2226269Z at::get_num_threads() : 128 2025-10-10T00:34:03.2226844Z at::get_num_interop_threads() : 128 2025-10-10T00:34:03.2227452Z OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-10-10T00:34:03.2228014Z omp_get_max_threads() : 128 2025-10-10T00:34:03.2229009Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-10-10T00:34:03.2230049Z mkl_get_max_threads() : 128 2025-10-10T00:34:03.2230729Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-10-10T00:34:03.2232143Z std::thread::hardware_concurrency() : 128 2025-10-10T00:34:03.2232732Z Environment variables: 2025-10-10T00:34:03.2233190Z OMP_NUM_THREADS : [not set] 2025-10-10T00:34:03.2233661Z MKL_NUM_THREADS : [not set] 2025-10-10T00:34:03.2234290Z ATen parallel backend: OpenMP 2025-10-10T00:34:03.2234609Z 2025-10-10T00:34:03.4941762Z + [[ default == *numpy_2* ]] 2025-10-10T00:34:03.4942422Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-10-10T00:34:03.4943069Z + [[ default == *backward* ]] 2025-10-10T00:34:03.4943589Z + [[ default == *xla* ]] 2025-10-10T00:34:03.4944624Z + [[ default == *vllm* ]] 2025-10-10T00:34:03.4945089Z + [[ default == *executorch* ]] 2025-10-10T00:34:03.4945599Z + [[ default == \j\i\t\_\l\e\g\a\c\y ]] 2025-10-10T00:34:03.4946220Z + [[ default == \q\u\a\n\t\i\z\a\t\i\o\n ]] 2025-10-10T00:34:03.4946918Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-10-10T00:34:03.4947601Z + [[ default == distributed ]] 2025-10-10T00:34:03.4948199Z + [[ default == *operator_benchmark* ]] 2025-10-10T00:34:03.4948786Z + [[ default == *operator_microbenchmark* ]] 2025-10-10T00:34:03.4949369Z + [[ default == *inductor_distributed* ]] 2025-10-10T00:34:03.4949932Z + [[ default == *inductor-halide* ]] 2025-10-10T00:34:03.4950486Z + [[ default == *inductor-triton-cpu* ]] 2025-10-10T00:34:03.4951078Z + [[ default == *inductor-micro-benchmark* ]] 2025-10-10T00:34:03.4951652Z + [[ default == *huggingface* ]] 2025-10-10T00:34:03.4952139Z + [[ default == *timm* ]] 2025-10-10T00:34:03.4952589Z + [[ default == cachebench ]] 2025-10-10T00:34:03.4953089Z + [[ default == 
verify_cachebench ]] 2025-10-10T00:34:03.4953601Z + [[ default == *torchbench* ]] 2025-10-10T00:34:03.4954255Z + [[ default == *inductor_cpp_wrapper* ]] 2025-10-10T00:34:03.4954797Z + [[ default == *inductor* ]] 2025-10-10T00:34:03.4955264Z + [[ default == *einops* ]] 2025-10-10T00:34:03.4955740Z + [[ default == *dynamo_wrapped* ]] 2025-10-10T00:34:03.4956270Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-10-10T00:34:03.4956782Z + [[ -n '' ]] 2025-10-10T00:34:03.4957151Z + [[ 1 == 1 ]] 2025-10-10T00:34:03.4957537Z + [[ 6 -gt 1 ]] 2025-10-10T00:34:03.4957979Z + test_lazy_tensor_meta_reference_disabled 2025-10-10T00:34:03.4958685Z + export TORCH_DISABLE_FUNCTIONALIZATION_META_REFERENCE=1 2025-10-10T00:34:03.4959439Z + TORCH_DISABLE_FUNCTIONALIZATION_META_REFERENCE=1 2025-10-10T00:34:03.4960189Z + echo 'Testing lazy tensor operations without meta reference' 2025-10-10T00:34:03.4960957Z Testing lazy tensor operations without meta reference 2025-10-10T00:34:03.4961764Z + python test/run_test.py --include lazy/test_ts_opinfo.py --verbose 2025-10-10T00:34:07.4125540Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-10-10T00:34:07.5789152Z Ignoring disabled issues: [''] 2025-10-10T00:34:07.6002393Z Found test times from artifacts 2025-10-10T00:34:07.6499330Z Found test times from artifacts 2025-10-10T00:34:07.6508322Z Running all tests 2025-10-10T00:34:07.6509233Z Running parallel tests on 8 processes 2025-10-10T00:34:07.6509924Z Name: tests to run (est. time: 0.01min) 2025-10-10T00:34:07.6510489Z Serial tests (0): 2025-10-10T00:34:07.6510930Z Parallel tests (1): 2025-10-10T00:34:07.6511397Z lazy/test_ts_opinfo 1/1 2025-10-10T00:34:07.6511924Z Name: excluded (est. time: 0.0min) 2025-10-10T00:34:07.6512411Z Serial tests (0): 2025-10-10T00:34:07.6512817Z Parallel tests (0): 2025-10-10T00:34:07.6513460Z Running lazy/test_ts_opinfo 1/1 ... 
[2025-10-10 00:34:07.650874] 2025-10-10T00:34:07.6514599Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T00:34:07.6517550Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_ts_opinfo.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:34:07.651232] 2025-10-10T00:34:11.8766829Z 2025-10-10T00:34:11.8769236Z lazy/test_ts_opinfo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_ts_opinfo_1.1_3d5568c52c511dd5_.log 2025-10-10T00:34:11.8770724Z Running 0 items in this shard: 2025-10-10T00:34:11.8771084Z 2025-10-10T00:34:11.8771545Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T00:34:11.8772396Z Uploading artifacts took 0.00 seconds 2025-10-10T00:34:15.3493570Z Running lazy/test_ts_opinfo 1/1 ... [2025-10-10 00:34:15.348172] 2025-10-10T00:34:15.3494336Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T00:34:15.3496815Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_ts_opinfo.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:34:15.348728] 2025-10-10T00:34:19.9260255Z 2025-10-10T00:34:19.9261445Z lazy/test_ts_opinfo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_ts_opinfo_1.1_c90f6d0202fe3185_.log 2025-10-10T00:34:19.9265090Z Running 5 items in this shard: test/lazy/test_ts_opinfo.py::TestLazyTensor::testConvolutionBackward, test/lazy/test_ts_opinfo.py::TestLazyTensor::test_tensor_ctr, test/lazy/test_ts_opinfo.py::TestLazyTensor::test_view_mark_step_preserved, test/lazy/test_ts_opinfo.py::TestLazyDynamicOps::test_adaptiveavgpool3d_dynamic, test/lazy/test_ts_opinfo.py::TestLazyDynamicOps::test_nonzero_dynamic 2025-10-10T00:34:19.9267850Z 2025-10-10T00:34:20.7674046Z Running test batch 'tests to run' cost 13.12 seconds 2025-10-10T00:34:21.3153249Z 2025-10-10T00:34:21.3153674Z real 0m17.821s 2025-10-10T00:34:21.3154549Z user 0m43.761s 2025-10-10T00:34:21.3155036Z sys 1m7.663s 2025-10-10T00:34:21.3155769Z + export -n TORCH_DISABLE_FUNCTIONALIZATION_META_REFERENCE 2025-10-10T00:34:21.3156449Z + test_without_numpy 2025-10-10T00:34:21.3180455Z ++ dirname .ci/pytorch/test.sh 2025-10-10T00:34:21.3197296Z + pushd .ci/pytorch 2025-10-10T00:34:21.3197962Z ~/pytorch/.ci/pytorch ~/pytorch 2025-10-10T00:34:21.3199732Z + python -c 'import sys;sys.path.insert(0, '\''fake_numpy'\'');from unittest import TestCase;import torch;x=torch.randn(3,3);TestCase().assertRaises(RuntimeError, lambda: x.numpy())' 2025-10-10T00:34:22.0928703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:280: UserWarning: Failed to initialize NumPy: Sorry PyTorch, but our NumPy is in the other folder (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/utils/tensor_numpy.cpp:84.) 
2025-10-10T00:34:22.0931283Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-10-10T00:34:22.8452341Z + python -c 'import sys;sys.path.insert(0, '\''fake_numpy'\'');import torch;print(torch.tensor([torch.tensor(0.), torch.tensor(1.)]))' 2025-10-10T00:34:23.6217964Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:280: UserWarning: Failed to initialize NumPy: Sorry PyTorch, but our NumPy is in the other folder (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/utils/tensor_numpy.cpp:84.) 2025-10-10T00:34:23.6220436Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-10-10T00:34:23.9970170Z tensor([0., 1.]) 2025-10-10T00:34:24.2588842Z + [[ default == *dynamo_wrapped* ]] 2025-10-10T00:34:24.2589819Z + python -c 'import sys;sys.path.insert(0, '\''fake_numpy'\'');import torch; import torch.onnx' 2025-10-10T00:34:25.0262967Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:280: UserWarning: Failed to initialize NumPy: Sorry PyTorch, but our NumPy is in the other folder (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/utils/tensor_numpy.cpp:84.) 
2025-10-10T00:34:25.0265398Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-10-10T00:34:25.6655022Z + popd 2025-10-10T00:34:25.6656772Z ~/pytorch 2025-10-10T00:34:25.6657285Z + install_torchvision 2025-10-10T00:34:25.6657797Z + local orig_preload 2025-10-10T00:34:25.6658235Z + local commit 2025-10-10T00:34:25.6667897Z ++ get_pinned_commit vision 2025-10-10T00:34:25.6668563Z ++ cat .github/ci_commit_pins/vision.txt 2025-10-10T00:34:25.6701675Z + commit=966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:34:25.6702413Z + orig_preload= 2025-10-10T00:34:25.6702851Z + '[' -n '' ']' 2025-10-10T00:34:25.6703330Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-10-10T00:34:25.6704522Z + pip_build_and_install git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 dist/vision 2025-10-10T00:34:25.6707005Z + local build_target=git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:34:25.6708092Z + local wheel_dir=dist/vision 2025-10-10T00:34:25.6708574Z + local found_whl=0 2025-10-10T00:34:25.6709017Z + for file in "${wheel_dir}"/*.whl 2025-10-10T00:34:25.6709536Z + [[ -f dist/vision/*.whl ]] 2025-10-10T00:34:25.6709990Z + '[' 0 == 0 ']' 2025-10-10T00:34:25.6711293Z + python3 -m pip wheel --no-build-isolation --no-deps --no-use-pep517 -w dist/vision git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:34:26.0549032Z Collecting git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:34:26.0552333Z Cloning https://github.com/pytorch/vision.git (to revision 966da7e46f65d6d49df3e31214470a4fe5cc8e66) to /tmp/pip-req-build-m15484c8 2025-10-10T00:34:26.0619420Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/vision.git /tmp/pip-req-build-m15484c8 2025-10-10T00:34:28.2256727Z Running command git rev-parse -q --verify 'sha^966da7e46f65d6d49df3e31214470a4fe5cc8e66' 
2025-10-10T00:34:28.2318252Z Running command git fetch -q https://github.com/pytorch/vision.git 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:34:28.6164512Z Running command git checkout -q 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:34:29.1448990Z Resolved https://github.com/pytorch/vision.git to commit 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:34:31.8765971Z Preparing metadata (setup.py) ... done 2025-10-10T00:34:31.8824887Z Building wheels for collected packages: torchvision 2025-10-10T00:34:31.8978321Z DEPRECATION: Building 'torchvision' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'torchvision'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T00:35:35.9442408Z Building wheel for torchvision (setup.py) ... done
2025-10-10T00:35:35.9483560Z Created wheel for torchvision: filename=torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl size=1713469 sha256=d4c4932f682b4db63bc07672d9c6e906e30bb6ce599e2f33f75d22ed5850f688 2025-10-10T00:35:35.9488581Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/9c/9d/3e/42fa2d5ac6ba44a90363f8fff0fa9e712e24d4f977637c81cb 2025-10-10T00:35:35.9540853Z Successfully built torchvision 2025-10-10T00:35:36.0728113Z + for file in "${wheel_dir}"/*.whl 2025-10-10T00:35:36.0728690Z + pip_install_whl dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-10-10T00:35:36.0729303Z + args=('dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl') 2025-10-10T00:35:36.0729774Z + local args 2025-10-10T00:35:36.0730154Z + [[ dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-10-10T00:35:36.0730605Z + for path in "${args[@]}" 2025-10-10T00:35:36.0731034Z + echo 'Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl' 2025-10-10T00:35:36.0731637Z Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-10-10T00:35:36.0732840Z + python3 -mpip install --no-index --no-deps dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-10-10T00:35:36.4664827Z Processing ./dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-10-10T00:35:36.4748723Z Installing collected packages: torchvision 2025-10-10T00:35:36.8416348Z Successfully installed torchvision-0.22.0a0+966da7e 2025-10-10T00:35:36.8792624Z + '[' -n '' ']' 2025-10-10T00:35:36.8793177Z + test_python_shard 1 2025-10-10T00:35:36.8794732Z + [[ -z 6 ]] 2025-10-10T00:35:36.8796163Z + python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --exclude-quantization-tests --shard 1 6 --verbose --upload-artifacts-while-running 2025-10-10T00:35:39.8960786Z
Excluding inductor/test_max_autotune on ROCm 2025-10-10T00:35:39.8961540Z Excluding test_cuda_nvml_based_avail on ROCm 2025-10-10T00:35:39.8962172Z Excluding test_openreg on ROCm 2025-10-10T00:35:40.7683204Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-10-10T00:35:40.7772149Z Found test times from artifacts 2025-10-10T00:35:40.8158489Z Found test times from artifacts 2025-10-10T00:35:40.8167224Z Running all tests 2025-10-10T00:35:40.8614573Z Running parallel tests on 8 processes 2025-10-10T00:35:40.8615439Z Name: tests to run (est. time: 138.02min) 2025-10-10T00:35:40.8615990Z Serial tests (50): 2025-10-10T00:35:40.8616472Z inductor/test_flex_attention 1/6 2025-10-10T00:35:40.8617104Z inductor/test_flex_attention 2/6 2025-10-10T00:35:40.8617669Z inductor/test_flex_attention 3/6 2025-10-10T00:35:40.8618210Z inductor/test_flex_attention 4/6 2025-10-10T00:35:40.8618759Z inductor/test_flex_attention 5/6 2025-10-10T00:35:40.8619287Z inductor/test_flex_attention 6/6 2025-10-10T00:35:40.8619842Z inductor/test_distributed_patterns 1/1 2025-10-10T00:35:40.8620422Z dynamo/test_fake_distributed 1/1 2025-10-10T00:35:40.8621039Z inductor/test_benchmark_fusion 1/1 2025-10-10T00:35:40.8621630Z inductor/test_cutlass_backend 1/1 2025-10-10T00:35:40.8622145Z test_torch 1/1 2025-10-10T00:35:40.8622566Z test_fx 1/1 2025-10-10T00:35:40.8622985Z test_ci_sanity_check_fail 1/1 2025-10-10T00:35:40.8623511Z test_mobile_optimizer 1/1 2025-10-10T00:35:40.8624006Z test_overrides 1/1 2025-10-10T00:35:40.8624506Z distributions/test_distributions 1/1 2025-10-10T00:35:40.8625118Z test_multiprocessing_spawn 1/1 2025-10-10T00:35:40.8625651Z doctests 1/1 2025-10-10T00:35:40.8626083Z test_autoload_enable 1/1 2025-10-10T00:35:40.8626574Z test_reductions 1/1 2025-10-10T00:35:40.8627025Z test_fake_tensor 1/1 2025-10-10T00:35:40.8627464Z test_nn 1/1 2025-10-10T00:35:40.8627898Z 
test_privateuseone_python_backend 1/1 2025-10-10T00:35:40.8628454Z test_spectral_ops 1/1 2025-10-10T00:35:40.8628978Z functorch/test_memory_efficient_fusion 1/1 2025-10-10T00:35:40.8629559Z nn/test_convolution 1/1 2025-10-10T00:35:40.8630046Z nn/test_pooling 1/1 2025-10-10T00:35:40.8630480Z test_autocast 1/1 2025-10-10T00:35:40.8630923Z test_autograd_fallback 1/1 2025-10-10T00:35:40.8631422Z test_autoload_disable 1/1 2025-10-10T00:35:40.8631908Z test_cpp_api_parity 1/1 2025-10-10T00:35:40.8632404Z test_cpp_extensions_aot_ninja 1/1 2025-10-10T00:35:40.8632963Z test_cpp_extensions_aot_no_ninja 1/1 2025-10-10T00:35:40.8633515Z test_cpp_extensions_jit 1/1 2025-10-10T00:35:40.8634024Z test_cpp_extensions_mtia_backend 1/1 2025-10-10T00:35:40.8634838Z test_cpp_extensions_stream_and_event 1/1 2025-10-10T00:35:40.8635389Z test_cuda_primary_ctx 1/1 2025-10-10T00:35:40.8635863Z test_cuda_trace 1/1 2025-10-10T00:35:40.8636293Z test_dispatch 1/1 2025-10-10T00:35:40.8636728Z test_extension_utils 1/1 2025-10-10T00:35:40.8637198Z test_jit_disabled 1/1 2025-10-10T00:35:40.8637669Z test_multiprocessing 1/1 2025-10-10T00:35:40.8638170Z test_namedtuple_return_api 1/1 2025-10-10T00:35:40.8643074Z test_native_mha 1/1 2025-10-10T00:35:40.8643572Z test_python_dispatch 1/1 2025-10-10T00:35:40.8644044Z test_show_pickle 1/1 2025-10-10T00:35:40.8644509Z test_sort_and_select 1/1 2025-10-10T00:35:40.8644991Z test_tensor_creation_ops 1/1 2025-10-10T00:35:40.8645480Z test_tensorexpr 1/1 2025-10-10T00:35:40.8645908Z test_utils 1/1 2025-10-10T00:35:40.8646328Z Parallel tests (20): 2025-10-10T00:35:40.8646799Z inductor/test_triton_cpu_backend 1/1 2025-10-10T00:35:40.8647764Z dynamo/test_torchrec 1/1 2025-10-10T00:35:40.8648236Z test_ops 7/9 2025-10-10T00:35:40.8648677Z test_cuda_expandable_segments 1/1 2025-10-10T00:35:40.8649196Z test_decomp 2/17 2025-10-10T00:35:40.8649614Z test_decomp 3/17 2025-10-10T00:35:40.8650019Z test_decomp 14/17 2025-10-10T00:35:40.8650455Z test_decomp 15/17 
2025-10-10T00:35:40.8650872Z test_jit_fuser_te 2/2 2025-10-10T00:35:40.8651326Z test_nestedtensor 1/3 2025-10-10T00:35:40.8651811Z profiler/test_execution_trace 1/1 2025-10-10T00:35:40.8652349Z profiler/test_record_function 1/1 2025-10-10T00:35:40.8652892Z test_sparse_semi_structured 1/1 2025-10-10T00:35:40.8653485Z functorch/test_aot_joint_with_descriptors 1/1 2025-10-10T00:35:40.8654110Z functorch/test_eager_transforms 1/1 2025-10-10T00:35:40.8654643Z functorch/test_vmap 1/1 2025-10-10T00:35:40.8655121Z functorch/test_control_flow 4/5 2025-10-10T00:35:40.8655637Z test_ops_gradients 2/3 2025-10-10T00:35:40.8656088Z test_ops_jit 1/2 2025-10-10T00:35:40.8656510Z xpu/test_conv 1/1 2025-10-10T00:35:40.8656970Z Name: excluded (est. time: 0.0min) 2025-10-10T00:35:40.8657461Z Serial tests (0): 2025-10-10T00:35:40.8657870Z Parallel tests (0): 2025-10-10T00:35:40.8658532Z Running inductor/test_flex_attention 1/6 ... [2025-10-10 00:35:40.861838] 2025-10-10T00:35:40.8659322Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T00:35:40.8661154Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_flex_attention.py', '--shard-id=1', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:35:40.862153] 2025-10-10T00:44:54.3947328Z 2025-10-10T00:44:54.3949425Z inductor/test_flex_attention 1/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_flex_attention_1.6_72d7faae375d28e7_.log 2025-10-10T00:44:54.4062290Z Running 127 items in this shard: test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_automatic_dynamic_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_automatic_dynamic_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE_256_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE_256_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE_256_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_seqlen_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_seqlen_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_seqlen_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_seqlen_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_dynamic_score_mask_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_dynamic_score_mask_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_custom_sparse_block_size_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_custom_sparse_block_size_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_default_sparse_block_size_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_default_sparse_block_size_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_captured_reduction_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_causal_block_paged_attention_cuda, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_device_cuda_1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_dynamic_divisibility_guards_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order3_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order1_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order4_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order0_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order3_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order4_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order0_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_force_write_lse_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_fw_bw_graph_correctness_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_inputs_are_realized_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims1_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims0_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims0_score_mod6_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims1_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims1_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims1_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims0_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims1_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims0_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims1_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims0_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims1_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims1_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_large_batch_heads_grid_dimension_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_load_from_bias_head_seq_batch_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_load_from_bias_seq_batch_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_logsumexp_correctness_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_lse_masked_output_backend_flex_attention_cuda, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_mask_mod_combiners_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_multiple_mask_calls_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_natten_2d_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod0_head_dims0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod2_head_dims0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod3_head_dims0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod4_head_dims0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod4_head_dims1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod5_head_dims0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod6_head_dims1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod7_head_dims0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod7_head_dims1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod7_head_dims1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_pow_2_headdim_head_dim_17_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux__squared_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_max__identity_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_max__rel_bias_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s0_v_s0_do_s0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s0_v_s0_do_s1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s1_v_s1_do_s0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s2_v_s2_do_s2_cuda_float16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_allocate_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod2_cuda_float32, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod3_cuda_bfloat16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod3_cuda_float32, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod4_cuda_bfloat16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod6_cuda_float32, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_backward_error_with_none_q_indices_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_mask_device_change_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_mask_viz_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_size_changes_BLOCK_SIZE_32_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_size_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_broadcasted_head_block_mask_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_flex_attention_poison_mod_fwd_cuda, 
test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_from_kv_blocks_without_q_computation_full_indices_True_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_init_mismatched_full_kv_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flex_attention_with_dynamic_max_autotune_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda 2025-10-10T00:44:54.4167243Z 2025-10-10T00:44:54.4167658Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T00:44:54.4168427Z Uploading artifacts took 0.00 seconds 2025-10-10T00:44:54.4169105Z Running inductor/test_flex_attention 2/6 ... [2025-10-10 00:44:54.395264] 2025-10-10T00:44:54.4169795Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T00:44:54.4171445Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_flex_attention.py', '--shard-id=2', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:44:54.395874] 2025-10-10T00:55:11.7445780Z 2025-10-10T00:55:11.7447284Z inductor/test_flex_attention 2/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_flex_attention_2.6_e4a9944cd1228377_.log 2025-10-10T00:55:11.7566424Z Running 132 items in this shard: test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_causal_mask_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_block_mask_non_divisible_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_automatic_dynamic_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_automatic_dynamic_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE_256_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE2_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE_256_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE_256_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE_256_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE_256_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_dynamic_score_mask_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod0_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod2_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod6_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod7_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_custom_sparse_block_size_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_default_sparse_block_size_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_default_sparse_block_size_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_causal_block_non_divisible_with_captured_buffer_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_cpu_error_message_return_lse_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_document_masking_edge_case_mode_aot_eager_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_document_masking_edge_case_mode_eager_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_dynamic_shapes_bug_dynamic_batch_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_epilogue_fused_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order0_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order1_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order2_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order3_shape0_cuda, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order3_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order0_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order0_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order2_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order4_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order4_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_fully_masked_out_rows_0_check_compile_True_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_function_composition_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kernel_options_argument_is_respected_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims0_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims1_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims1_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims0_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims0_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims0_score_mod7_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims1_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims1_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims0_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims1_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims0_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims0_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims0_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims1_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims0_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims0_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_logsumexp_correctness_score_mod0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_logsumexp_correctness_score_mod1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_logsumexp_correctness_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_lse_masked_output_backend_eager_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_modular_indexing_cuda, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_njt_causal_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod1_head_dims0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod1_head_dims1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod1_head_dims1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod3_head_dims0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod3_head_dims1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod6_head_dims0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod7_head_dims0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_pow_2_headdim_head_dim_24_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_padded_dense_causal_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux__alibi_bias_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_max__rel_causal_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_silu_on_score_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s0_v_s0_do_s2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s2_v_s2_do_s2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s3_v_s3_do_s0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s0_v_s0_do_s2_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s1_v_s1_do_s2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s2_v_s2_do_s0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_subgraph_respect_decompostion_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_triton_template_warp_specialization_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod5_cuda_float32, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_mask_attributes_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_mask_operations_with_none_q_indices_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_mask_vs_sequence_lengths_compile_True_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_compiling_create_block_mask_no_recompile_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_backprop_error_case_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda
2025-10-10T00:55:11.7680269Z
2025-10-10T00:55:11.7680681Z Running inductor/test_flex_attention 3/6 ... [2025-10-10 00:55:11.744849]
2025-10-10T00:55:11.7681847Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T00:55:11.7683735Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_flex_attention.py', '--shard-id=3', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:55:11.745425]
2025-10-10T01:03:53.5129813Z
2025-10-10T01:03:53.5131381Z inductor/test_flex_attention 3/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_flex_attention_3.6_97ba324c51b5eccb_.log
2025-10-10T01:03:53.5224152Z Running 107 items in this shard: test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_aot_eager_gradcheck_score_mod0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_automatic_dynamic_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE_256_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE2_cuda_float32,
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE_256_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE_256_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_seqlen_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_seqlen_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_dynamic_score_mask_mod4_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod3_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod6_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_doc_mask_sparse_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_dynamic_shapes_with_max_autotune_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order0_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order4_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order1_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order2_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order2_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order0_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order1_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order1_shape1_cuda, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order1_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order1_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order2_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_float32_matmul_precision_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_fully_masked_out_rows_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims0_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims0_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims0_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims0_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims1_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims1_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims1_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims0_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims0_score_mod5_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims1_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims0_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims0_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_load_from_bias_seq_only_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_load_rel_bias_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_make_block_mask_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_max_autotune_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_multiple_score_mod_calls_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_contiguous_last_dim_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod0_head_dims1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod0_head_dims1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod1_head_dims1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod2_head_dims1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod4_head_dims0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod5_head_dims0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod6_head_dims0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod7_head_dims0_cuda_float32, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_recompile_changed_score_mod_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_reduction_unrolled_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_max__inverse_causal_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_selective_ac_ops_to_save1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_skip_odd_keys_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s1_v_s1_do_s1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s0_v_s0_do_s0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s3_v_s3_do_s1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_zero_length_sequence_error_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_convert_mask_mod_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod5_cuda_bfloat16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod6_cuda_bfloat16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod7_cuda_bfloat16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_size_changes_BLOCK_SIZE5_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_create_is_cuda_graphable_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_from_kv_blocks_full_indices_True_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flex_attention_with_dynamic_max_autotune_graph_partition_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_bias_req_grad_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_inspect_bug_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda
2025-10-10T01:03:53.5314262Z
2025-10-10T01:03:53.5314703Z Running inductor/test_flex_attention 4/6 ... [2025-10-10 01:03:53.513199]
2025-10-10T01:03:53.5329466Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T01:03:53.5331496Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_flex_attention.py', '--shard-id=4', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:03:53.513761]
2025-10-10T01:12:29.7127930Z
2025-10-10T01:12:29.7134622Z inductor/test_flex_attention 4/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_flex_attention_4.6_c75a5437926505b9_.log
2025-10-10T01:12:29.7230141Z Running 109 items in this shard: test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_aot_eager_gradcheck_score_mod2_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_automatic_dynamic_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE_256_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE_256_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE_256_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE3_cuda_bfloat16,
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE_256_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE_256_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE_256_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_dynamic_score_mask_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_dynamic_score_mask_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod5_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_default_sparse_block_size_score_mod1_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_default_sparse_block_size_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_captured_buffers_all_dims_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_captured_score_mod_aot_eager_gradcheck_score_mod_name__head_offset_mode_eager_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_causal_block_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_debug_flag_disables_internal_compilation_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_eager_backward_strides_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order0_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order0_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order3_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order4_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order4_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_free_symbol_dynamic_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_function_composition_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims0_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims0_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims0_score_mod7_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims1_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims1_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims0_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims0_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims0_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims1_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims1_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims0_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims0_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims1_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims1_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims0_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims0_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims0_score_mod4_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims1_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims1_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims1_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_multiple_score_mod_calls_paged_attention_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_njt_causal_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod0_head_dims0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod0_head_dims1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod1_head_dims0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod1_head_dims0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod2_head_dims0_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod2_head_dims1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod2_head_dims1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod3_head_dims0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod3_head_dims1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod3_head_dims1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod4_head_dims0_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod6_head_dims1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod7_head_dims1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_num_warps_8_error_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux__inverse_causal_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_max__squared_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_max__times_two_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_seq_masking_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s3_v_s3_do_s1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s0_v_s0_do_s1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s3_v_s3_do_s2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_symbol_closure_in_score_mod_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_convert_logical_block_mask_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod0_cuda_float32, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod7_cuda_float32, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_update_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_size_changes_BLOCK_SIZE4_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_size_changes_BLOCK_SIZE_256_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_size_changes_BLOCK_SIZE_64_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_eager_tracing_correctness_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_mode_default_cuda 2025-10-10T01:12:29.7324387Z 2025-10-10T01:12:29.7324847Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T01:12:29.7325688Z Uploading artifacts took 0.00 seconds 2025-10-10T01:12:29.7326452Z Running inductor/test_flex_attention 5/6 ... [2025-10-10 01:12:29.713156] 2025-10-10T01:12:29.7327247Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:12:29.7329099Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_flex_attention.py', '--shard-id=5', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:12:29.713720] 2025-10-10T01:22:04.2132163Z 2025-10-10T01:22:04.2133821Z inductor/test_flex_attention 5/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_flex_attention_5.6_1a3101839e4c50f2_.log 2025-10-10T01:22:04.2251842Z Running 135 items in this shard: test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_automatic_dynamic_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE_256_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE_256_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE3_cuda_bfloat16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE_256_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_seqlen_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod4_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod4_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod7_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_custom_sparse_block_size_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_default_sparse_block_size_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_default_sparse_block_size_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_captured_buffers_all_dims_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_captured_scale_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_causal_block_non_divisible_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_custom_block_mask_generator_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_differentiable_logsumexp_compiled_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_differentiable_logsumexp_gradcheck_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_dynamic_shapes_with_custom_kernel_options_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order1_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order2_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order2_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order4_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order4_shape1_cuda, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order2_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order1_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order3_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_eager_permute_order4_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order2_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order2_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_inductor_permute_order3_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_index_multiple_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_index_weird2_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_invalid_block_size_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims0_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims0_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims0_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims1_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims1_score_mod5_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims1_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims0_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims0_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims1_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims0_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims0_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims1_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims1_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims0_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims0_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims1_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims1_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_logsumexp_correctness_score_mod0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_multiple_score_mod_calls2_cuda, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_divisible_with_captured_buffer_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod0_head_dims0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod2_head_dims0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod5_head_dims1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod5_head_dims1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod6_head_dims0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_pow_2_headdim_head_dim_121_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_pow_2_headdim_head_dim_94_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_qkv_and_block_mask_on_the_same_device_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux__causal_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux__identity_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux__rel_bias_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_max__alibi_bias_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_selective_ac_ops_to_save2_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_skip_odd_keys_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_small_q_kv_len_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_backwards_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s1_v_s1_do_s0_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s1_v_s1_do_s2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s2_v_s2_do_s0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s2_v_s2_do_s1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s2_v_s2_do_s1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s3_v_s3_do_s0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_tensor_subclass_dispatch_order_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_page_allocation_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod2_cuda_bfloat16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod4_cuda_float32, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_mask_vs_sequence_lengths_compile_False_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_block_size_changes_BLOCK_SIZE_128_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_compiling_create_block_mask_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_flex_attention_poison_mod_bwd_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_flex_attention_poisoned_rel_logits_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_from_kv_blocks_full_indices_False_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_absolute_2d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda 2025-10-10T01:22:04.2360980Z 2025-10-10T01:22:04.2361351Z Running inductor/test_flex_attention 6/6 ... [2025-10-10 01:22:04.213601] 2025-10-10T01:22:04.2362072Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:22:04.2363725Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_flex_attention.py', '--shard-id=6', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:22:04.214197] 2025-10-10T01:32:23.7151330Z 2025-10-10T01:32:23.7152983Z inductor/test_flex_attention 6/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_flex_attention_6.6_94a12aa05c24e37e_.log 2025-10-10T01:32:23.7273295Z Running 140 items in this shard: test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_aot_eager_gradcheck_score_mod1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_aot_eager_gradcheck_score_mod3_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_aot_eager_gradcheck_score_mod4_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_aot_eager_gradcheck_score_mod5_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_autograd_function_in_score_mod_cuda, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_automatic_dynamic_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod0_BLOCK_SIZE_256_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod1_BLOCK_SIZE_256_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod2_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod3_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod4_BLOCK_SIZE_256_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE2_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod5_BLOCK_SIZE_256_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod6_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_block_size_score_mod7_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_different_seqlen_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_dynamic_score_mask_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_dynamic_score_mask_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod0_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod5_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_score_mod7_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_custom_sparse_block_size_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_custom_sparse_block_size_score_mod3_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_custom_sparse_block_size_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_builtin_score_mods_seqlen_lt_custom_sparse_block_size_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_cant_lower_error_message_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_captured_buffers_all_dims_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_captured_score_mod_aot_eager_gradcheck_score_mod_name__head_offset_mode_aot_eager_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_captured_wrong_device_error_message_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_custom_score_mod_layout_freeze_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_dependent_causal_bidirectional_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order1_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_eager_permute_order3_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_backward_stride_ordering_mode_inductor_permute_order0_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order3_shape0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_flex_attention_stride_ordering_mode_paged_attention_permute_order3_shape1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_fully_masked_out_rows_0_check_compile_False_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_function_composition_cuda_float32, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_index_weird1_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims0_head_dims1_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims1_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims1_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims1_head_dims1_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims0_score_mod2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims0_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims1_score_mod1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_batch_dims2_head_dims1_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims0_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims1_score_mod4_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims0_head_dims1_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims0_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims1_score_mod0_cuda_float16, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims1_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims1_head_dims1_score_mod7_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims0_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims1_score_mod0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_kv_batch_broadcast_causal_mask_batch_dims2_head_dims1_score_mod6_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_load_from_view_buffer_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_logsumexp_correctness_score_mod1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_logsumexp_only_return_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_lse_masked_output_backend_flex_decode_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_max_autotune_with_captured_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_mixed_device_error_message_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_mixed_dtypes_fails_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_multiple_score_mod_calls2_paged_attention_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_new_empty_mask_mod_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_njt_causal_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod4_head_dims1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod4_head_dims1_cuda_float32, 
test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod5_head_dims0_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod5_head_dims1_cuda_bfloat16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_non_equal_head_dims_score_mod6_head_dims1_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux__rel_causal_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux__times_two_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_aux_deprecation_warnings_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_return_max__causal_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_selective_ac_ops_to_save0_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_selective_ac_with_max_autotune_short_query_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_skip_odd_keys_cuda_float32, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_small_block_mask_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s0_k_s3_v_s3_do_s2_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_strided_inputs_q_s1_k_s1_v_s1_do_s1_cuda_float16, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_tma_with_customer_kernel_options_cuda, test/inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_validate_small_embedding_size_error_message_cuda, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod1_cuda_float32, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod3_cuda_float16, test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod4_cuda_float16, 
test/inductor/test_flex_attention.py::TestPagedAttentionCUDA::test_paged_builtin_score_mods_score_mod5_cuda_float16, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_doc_mask_clamped_repro_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_forward_pass_with_none_q_indices_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_from_kv_blocks_without_q_computation_full_indices_False_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_getitem_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_init_mismatched_full_q_cuda, test/inductor/test_flex_attention.py::TestBlockMaskCUDA::test_upcast_appropriately_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_batch_head_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_comparison_vs_sdpa_with_learnable_bias_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_distinct_biases_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_flipped_indexed_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:bfloat16_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_global_tokens_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:256_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_head_specific_gate_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_indirect_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_learnable_bias_global_compiled_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_local_window_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_multiplicative_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_mode_default_cuda, 
test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_relative_1d_bias_only_grad_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:256_headdim:16_dtype:float32_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float16_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_mode_default_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_symmetric_bias_batch:2_head:4_seq_len:37_headdim:16_dtype:float32_mode_max-autotune-no-cudagraphs_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:bfloat16_cuda, test/inductor/test_flex_attention.py::TestLearnableBiasesCUDA::test_weird_bias_batch:2_head:4_seq_len:277_headdim:16_dtype:float16_cuda 2025-10-10T01:32:23.7392679Z 2025-10-10T01:32:23.7393150Z Running inductor/test_distributed_patterns 1/1 ... 
[2025-10-10 01:32:23.715754] 2025-10-10T01:32:23.7393985Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:32:23.7395957Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_distributed_patterns.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:32:23.716306] 2025-10-10T01:32:50.6003108Z 2025-10-10T01:32:50.6004624Z inductor/test_distributed_patterns 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_distributed_patterns_1.1_7b1dca74f262abcb_.log 2025-10-10T01:32:50.6022322Z Running 20 items in this shard: test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_fake_distributed_aot_eager, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_fake_distributed_inductor, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_intermediate_hook_with_closure, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_intermediate_hook_with_nested_closure, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_aot, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_eager, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_inductor, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_multi_layers, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return2, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return3, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return4, 
test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_nonzero_cpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_nonzero_gpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_zero_cpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_zero_gpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_preserve_version_counter1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_preserve_version_counter2, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_set_version_counter1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_set_version_counter2 2025-10-10T01:32:50.6037477Z 2025-10-10T01:32:50.6038525Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T01:32:50.6039543Z Uploading artifacts took 0.00 seconds 2025-10-10T01:32:50.6040346Z Running dynamo/test_fake_distributed 1/1 ... [2025-10-10 01:32:50.600377] 2025-10-10T01:32:50.6041131Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:32:50.6042990Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fake_distributed.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:32:50.600979] 2025-10-10T01:32:59.0872582Z 2025-10-10T01:32:59.0874520Z dynamo/test_fake_distributed 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fake_distributed_1.1_b6249398699382cf_.log 2025-10-10T01:32:59.0877079Z Running 2 items in this shard: test/dynamo/test_fake_distributed.py::TestFakeDistributed::test_all_to_all_single_autograd, test/dynamo/test_fake_distributed.py::TestFakeDistributed::test_device_mesh_get_local_rank 2025-10-10T01:32:59.0878829Z 2025-10-10T01:32:59.0881716Z Running inductor/test_benchmark_fusion 1/1 ... [2025-10-10 01:32:59.087431] 2025-10-10T01:32:59.0882587Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:32:59.0884490Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_benchmark_fusion.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:32:59.088000] 2025-10-10T01:33:37.6985166Z 2025-10-10T01:33:37.6987009Z inductor/test_benchmark_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_benchmark_fusion_1.1_88320cbb0d7b4ce6_.log 2025-10-10T01:33:37.7001162Z Running 16 items in this shard: test/inductor/test_benchmark_fusion.py::BenchmarkFusionCudaTest::test_avoid_register_spilling_cuda, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCudaTest::test_foreach_kernel_cuda, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCudaTest::test_register_spills_cuda, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCudaTest::test_resnet18_cuda, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCudaTest::test_softmax_cuda, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCudaTest::test_tield_kernel_fusion_cuda, test/inductor/test_benchmark_fusion.py::BenchmarkingTest::test_benchmark_on_non_zero_device, 
test/inductor/test_benchmark_fusion.py::BenchmarkMultiTemplateFusionCudaTest::test_changed_layout, test/inductor/test_benchmark_fusion.py::BenchmarkMultiTemplateFusionCudaTest::test_equivalent_extern_code, test/inductor/test_benchmark_fusion.py::BenchmarkMultiTemplateFusionCudaTest::test_equivalent_template_code, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCpuTest::test_avoid_register_spilling_cpu, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCpuTest::test_foreach_kernel_cpu, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCpuTest::test_register_spills_cpu, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCpuTest::test_resnet18_cpu, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCpuTest::test_softmax_cpu, test/inductor/test_benchmark_fusion.py::BenchmarkFusionCpuTest::test_tield_kernel_fusion_cpu 2025-10-10T01:33:37.7012149Z 2025-10-10T01:33:37.7012561Z Running inductor/test_cutlass_backend 1/1 ... [2025-10-10 01:33:37.698666] 2025-10-10T01:33:37.7014291Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:33:37.7016306Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cutlass_backend.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:33:37.699212] 2025-10-10T01:33:44.2315769Z 2025-10-10T01:33:44.2317380Z inductor/test_cutlass_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cutlass_backend_1.1_644672d1b8ed5dce_.log 2025-10-10T01:33:44.2452037Z Running 180 items in this shard: test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_aoti_workspace_ptr, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_check_paths, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_compilation_time_use_aoti_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_compilation_time_use_aoti_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_config_number_post_filtering, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_fp8_scaled_mm_fast_accum_filtering, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_integration, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_matmul_nonzero_offset, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_matmul_same_tensor, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_op_allowlist, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_op_denylist, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_shape_coverage_mm, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_subproc_addmm_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_subproc_addmm_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_subproc_bmm, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_backend_subproc_mm, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_cutlass_key, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_diff_matmul_share_same_kernel_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_diff_matmul_share_same_kernel_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_activations_exp, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_activations_relu, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_activations_sigmoid, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_activations_tanh, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_broadcasting_add, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_broadcasting_div, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_broadcasting_mul, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_broadcasting_sub, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_flexible_layout, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_add_shape0, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_add_shape1, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_add_shape2, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_add_shape3, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_div_shape0, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_div_shape1, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_div_shape2, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_div_shape3, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_exp_shape0, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_exp_shape1, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_exp_shape2, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_exp_shape3, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_mul_shape0, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_mul_shape1, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_mul_shape2, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_mul_shape3, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_relu_shape0, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_relu_shape1, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_relu_shape2, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_relu_shape3, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_sigmoid_shape0, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_sigmoid_shape1, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_sigmoid_shape2, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_sigmoid_shape3, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_sub_shape0, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_sub_shape1, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_sub_shape2, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_sub_shape3, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_tanh_shape0, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_tanh_shape1, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_tanh_shape2, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_fusions_basic_tanh_shape3, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_mixed_dtypes_add, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_mixed_dtypes_div, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_mixed_dtypes_exp, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_mixed_dtypes_mul, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_mixed_dtypes_relu, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_mixed_dtypes_sigmoid, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_mixed_dtypes_sub, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_mixed_dtypes_tanh, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_op_add, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_op_div, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_op_exp, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_op_mul, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_op_relu, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_op_sigmoid, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_op_sub, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_op_tanh, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_add_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_add_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_div_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_div_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_exp_dynamic_False, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_exp_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_mul_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_mul_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_relu_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_relu_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_sigmoid_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_sigmoid_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_sub_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_sub_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_tanh_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_multi_output_tanh_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_return_accumulator, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_reuse_matmul_input_add, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_reuse_matmul_input_div, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_reuse_matmul_input_exp, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_reuse_matmul_input_mul, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_reuse_matmul_input_relu, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_reuse_matmul_input_sigmoid, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_reuse_matmul_input_sub, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_evt_reuse_matmul_input_tanh, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_filtered_ops_cache, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_flexible_layout, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_force_cutlass_backend_aoti_cexpr_codegen, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_force_cutlass_backend_aoti_dynamic, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_float8_e4m3fn_shape0_has_bias_False_use_fast_accum_False_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_float8_e4m3fn_shape0_has_bias_False_use_fast_accum_False_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_float8_e4m3fn_shape0_has_bias_False_use_fast_accum_True_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_float8_e4m3fn_shape0_has_bias_False_use_fast_accum_True_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_float8_e4m3fn_shape0_has_bias_True_use_fast_accum_False_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_float8_e4m3fn_shape0_has_bias_True_use_fast_accum_False_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_float8_e4m3fn_shape0_has_bias_True_use_fast_accum_True_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_float8_e4m3fn_shape0_has_bias_True_use_fast_accum_True_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_multiple_linear_float8_e4m3fn_shape0_use_fast_accum_True_use_aoti_False_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_multiple_linear_float8_e4m3fn_shape0_use_fast_accum_True_use_aoti_False_dynamic_True, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_multiple_linear_float8_e4m3fn_shape0_use_fast_accum_True_use_aoti_True_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_rowwise_scaling_multiple_linear_float8_e4m3fn_shape0_use_fast_accum_True_use_aoti_True_dynamic_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_tensorwise_scaling_float8_e4m3fn_shape0_has_bias_False_use_fast_accum_False_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_tensorwise_scaling_float8_e4m3fn_shape0_has_bias_False_use_fast_accum_False_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_tensorwise_scaling_float8_e4m3fn_shape0_has_bias_True_use_fast_accum_False_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_fp8_tensorwise_scaling_float8_e4m3fn_shape0_has_bias_True_use_fast_accum_False_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_gemm_operation_serialization_arch_100_cuda_version_12_4, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_gemm_operation_serialization_arch_100_cuda_version_12_8, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_gemm_operation_serialization_arch_90_cuda_version_12_4, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_gemm_operation_serialization_arch_90_cuda_version_12_8, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_get_max_alignment, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_import_cutlass, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_addmm_dynamic_False_use_aoti_False_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_addmm_dynamic_False_use_aoti_False_float16, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_addmm_dynamic_False_use_aoti_True_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_addmm_dynamic_False_use_aoti_True_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_addmm_dynamic_True_use_aoti_False_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_addmm_dynamic_True_use_aoti_False_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_addmm_dynamic_True_use_aoti_True_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_addmm_dynamic_True_use_aoti_True_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_False_use_aoti_False_bfloat16_use_expand_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_False_use_aoti_False_bfloat16_use_expand_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_False_use_aoti_False_float16_use_expand_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_False_use_aoti_False_float16_use_expand_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_False_use_aoti_True_bfloat16_use_expand_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_False_use_aoti_True_bfloat16_use_expand_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_False_use_aoti_True_float16_use_expand_False, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_False_use_aoti_True_float16_use_expand_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_True_use_aoti_False_bfloat16_use_expand_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_True_use_aoti_False_bfloat16_use_expand_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_True_use_aoti_False_float16_use_expand_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_True_use_aoti_False_float16_use_expand_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_True_use_aoti_True_bfloat16_use_expand_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_True_use_aoti_True_bfloat16_use_expand_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_True_use_aoti_True_float16_use_expand_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_bmm_dynamic_True_use_aoti_True_float16_use_expand_True, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_chained_fusion_fp16_fp32acc, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_fp8_scaled_mm_dynamic_False_use_aoti_False_float8_e4m3fn, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_fp8_scaled_mm_dynamic_False_use_aoti_True_float8_e4m3fn, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_fp8_scaled_mm_dynamic_True_use_aoti_False_float8_e4m3fn, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_fp8_scaled_mm_dynamic_True_use_aoti_True_float8_e4m3fn, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_int_mm_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_no_fusion_dtype_mismatch, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_dynamic_False_use_aoti_False_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_dynamic_False_use_aoti_False_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_dynamic_False_use_aoti_True_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_dynamic_False_use_aoti_True_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_dynamic_True_use_aoti_False_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_dynamic_True_use_aoti_False_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_dynamic_True_use_aoti_True_bfloat16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_dynamic_True_use_aoti_True_float16, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_regular_mm_streamk, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_relu6_fusion_fp16_fp32acc, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_relu_fusion_fp16_fp32acc, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_shape_dependent_normalization_fusion, 
test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_simple_fusion_fp16_fp32acc, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_backend_sparse_semi_structured_mm_dynamic_False, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_max_autotune_cutlass_threshold, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_maybe_append_choice_caching, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_multiple_mm, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_multiple_mm_with_dynamic_shape, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_number_mm_precompiles, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_standalone_runner, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_streamk_with_dynamic, test/inductor/test_cutlass_backend.py::TestCutlassBackend::test_streamk_with_static
2025-10-10T01:33:44.2570673Z
2025-10-10T01:33:44.2570940Z Running test_torch 1/1 ... [2025-10-10 01:33:44.232264]
2025-10-10T01:33:44.2571507Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T01:33:44.2573022Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_torch.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:33:44.232815]
2025-10-10T01:37:28.8016762Z
2025-10-10T01:37:28.8018166Z test_torch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_torch_1.1_25e9f96f71824599_.log
2025-10-10T01:37:28.8479939Z Running 975 items in this shard: test/test_torch.py::TestBasicVitalSigns::test_basic_vitals, test/test_torch.py::TestBasicVitalSigns::test_basic_vitals_read_write, test/test_torch.py::TestBasicVitalSigns::test_dataloader_vitals, test/test_torch.py::TestTorch::test_RNGState, test/test_torch.py::TestTorch::test_RNGStateAliasing, test/test_torch.py::TestTorch::test_RNG_after_pickle, test/test_torch.py::TestTorch::test_Size, test/test_torch.py::TestTorch::test_Size_concat_non_tuple_sequence, test/test_torch.py::TestTorch::test_Size_concat_wildcard, test/test_torch.py::TestTorch::test_Size_iter, test/test_torch.py::TestTorch::test_Size_scalar, test/test_torch.py::TestTorch::test_add_meta_scalar, test/test_torch.py::TestTorch::test_allow_tensor_metadata_change, test/test_torch.py::TestTorch::test_apply, test/test_torch.py::TestTorch::test_as_subclass, test/test_torch.py::TestTorch::test_assert_async, test/test_torch.py::TestTorch::test_backward_hooks_traverse, test/test_torch.py::TestTorch::test_batch_norm_cpu_inference, test/test_torch.py::TestTorch::test_bf16_supported_on_cpu, test/test_torch.py::TestTorch::test_bmm_multithreaded, test/test_torch.py::TestTorch::test_boxMullerState, test/test_torch.py::TestTorch::test_cat_neg_dim, test/test_torch.py::TestTorch::test_check, test/test_torch.py::TestTorch::test_chunk_neg_dim, test/test_torch.py::TestTorch::test_conj_neg_tolist, test/test_torch.py::TestTorch::test_conj_physical_meta_stride, test/test_torch.py::TestTorch::test_contains, test/test_torch.py::TestTorch::test_copy_broadcast, test/test_torch.py::TestTorch::test_copy_dtypes, test/test_torch.py::TestTorch::test_copy_float16, test/test_torch.py::TestTorch::test_copy_many_to_one,
test/test_torch.py::TestTorch::test_copy_transpose, test/test_torch.py::TestTorch::test_cuda_not_built, test/test_torch.py::TestTorch::test_cummax_neg_dim, test/test_torch.py::TestTorch::test_cummin_neg_dim, test/test_torch.py::TestTorch::test_cumprod_neg_dim, test/test_torch.py::TestTorch::test_cumsum_neg_dim, test/test_torch.py::TestTorch::test_cxx_flags, test/test_torch.py::TestTorch::test_data_ptr_of_empty_tensor_with_storage, test/test_torch.py::TestTorch::test_data_ptr_of_empty_view_with_storage, test/test_torch.py::TestTorch::test_deepcopy_gradient, test/test_torch.py::TestTorch::test_deepcopy_parameter, test/test_torch.py::TestTorch::test_deterministic_fill_uninitialized_memory, test/test_torch.py::TestTorch::test_deterministic_flag, test/test_torch.py::TestTorch::test_device, test/test_torch.py::TestTorch::test_dim_order, test/test_torch.py::TestTorch::test_dir, test/test_torch.py::TestTorch::test_doc, test/test_torch.py::TestTorch::test_doc_template, test/test_torch.py::TestTorch::test_dot_data_use, test/test_torch.py::TestTorch::test_dtype_is_signed, test/test_torch.py::TestTorch::test_element_size, test/test_torch.py::TestTorch::test_empty_meta, test/test_torch.py::TestTorch::test_empty_storage_view, test/test_torch.py::TestTorch::test_equal, test/test_torch.py::TestTorch::test_error_msg_type_translation, test/test_torch.py::TestTorch::test_fill_diagonal, test/test_torch.py::TestTorch::test_format_scalar_meta, test/test_torch.py::TestTorch::test_from_buffer, test/test_torch.py::TestTorch::test_from_file, test/test_torch.py::TestTorch::test_gather_neg_dim, test/test_torch.py::TestTorch::test_generator_cpu, test/test_torch.py::TestTorch::test_get_cpu_capability, test/test_torch.py::TestTorch::test_has_internal_overlap, test/test_torch.py::TestTorch::test_has_storage, test/test_torch.py::TestTorch::test_index_add, test/test_torch.py::TestTorch::test_index_add_all_dtypes, test/test_torch.py::TestTorch::test_index_add_cornercase, 
test/test_torch.py::TestTorch::test_index_add_correctness, test/test_torch.py::TestTorch::test_index_add_neg_dim, test/test_torch.py::TestTorch::test_index_copy_neg_dim, test/test_torch.py::TestTorch::test_index_fill_neg_dim, test/test_torch.py::TestTorch::test_index_select_neg_dim, test/test_torch.py::TestTorch::test_invalid_arg_error_handling, test/test_torch.py::TestTorch::test_invalid_generator_raises, test/test_torch.py::TestTorch::test_is_nonzero, test/test_torch.py::TestTorch::test_is_same_size, test/test_torch.py::TestTorch::test_iter, test/test_torch.py::TestTorch::test_kthvalue_neg_dim, test/test_torch.py::TestTorch::test_linspace_logspace, test/test_torch.py::TestTorch::test_logcumsumexp_neg_dim, test/test_torch.py::TestTorch::test_manual_seed, test/test_torch.py::TestTorch::test_map, test/test_torch.py::TestTorch::test_map2, test/test_torch.py::TestTorch::test_max_neg_dim, test/test_torch.py::TestTorch::test_mean_neg_dim, test/test_torch.py::TestTorch::test_median_neg_dim, test/test_torch.py::TestTorch::test_memory_format, test/test_torch.py::TestTorch::test_memory_format_contiguous_returns_same_tensor_if_already_satisfies, test/test_torch.py::TestTorch::test_memory_format_empty, test/test_torch.py::TestTorch::test_min_neg_dim, test/test_torch.py::TestTorch::test_mode_neg_dim, test/test_torch.py::TestTorch::test_multinomial_invalid_probs, test/test_torch.py::TestTorch::test_nanmedian_neg_dim, test/test_torch.py::TestTorch::test_narrow_neg_dim, test/test_torch.py::TestTorch::test_nbytes, test/test_torch.py::TestTorch::test_ndim, test/test_torch.py::TestTorch::test_new, test/test_torch.py::TestTorch::test_newaxis_numpy_comparison, test/test_torch.py::TestTorch::test_newindex, test/test_torch.py::TestTorch::test_no_cuda_monkeypatch, test/test_torch.py::TestTorch::test_norm_neg_dim, test/test_torch.py::TestTorch::test_normal_shape, test/test_torch.py::TestTorch::test_numel, test/test_torch.py::TestTorch::test_parallel_info, 
test/test_torch.py::TestTorch::test_parsing_double, test/test_torch.py::TestTorch::test_parsing_int64, test/test_torch.py::TestTorch::test_parsing_intlist, test/test_torch.py::TestTorch::test_permute, test/test_torch.py::TestTorch::test_pickle, test/test_torch.py::TestTorch::test_pickle_dtype, test/test_torch.py::TestTorch::test_pickle_function, test/test_torch.py::TestTorch::test_pickle_generator, test/test_torch.py::TestTorch::test_pickle_parameter, test/test_torch.py::TestTorch::test_pickle_parameter_no_requires_grad, test/test_torch.py::TestTorch::test_pickle_size, test/test_torch.py::TestTorch::test_pin_memory, test/test_torch.py::TestTorch::test_print, test/test_torch.py::TestTorch::test_prod_neg_dim, test/test_torch.py::TestTorch::test_pyobj_preserved, test/test_torch.py::TestTorch::test_qengine, test/test_torch.py::TestTorch::test_renorm_neg_dim, test/test_torch.py::TestTorch::test_resizable, test/test_torch.py::TestTorch::test_reversed, test/test_torch.py::TestTorch::test_scatter_neg_dim, test/test_torch.py::TestTorch::test_select_neg_dim, test/test_torch.py::TestTorch::test_set_flush_denormal, test/test_torch.py::TestTorch::test_setting_real_imag_to_a_number, test/test_torch.py::TestTorch::test_show_config, test/test_torch.py::TestTorch::test_size_neg_dim, test/test_torch.py::TestTorch::test_size_stride, test/test_torch.py::TestTorch::test_sizeof, test/test_torch.py::TestTorch::test_slice, test/test_torch.py::TestTorch::test_slow_test, test/test_torch.py::TestTorch::test_sobolengine_bounds, test/test_torch.py::TestTorch::test_sobolengine_bounds_scrambled, test/test_torch.py::TestTorch::test_sobolengine_continuing, test/test_torch.py::TestTorch::test_sobolengine_continuing_scrambled, test/test_torch.py::TestTorch::test_sobolengine_default_dtype, test/test_torch.py::TestTorch::test_sobolengine_distribution, test/test_torch.py::TestTorch::test_sobolengine_distribution_scrambled, test/test_torch.py::TestTorch::test_sobolengine_draw, 
test/test_torch.py::TestTorch::test_sobolengine_draw_base2, test/test_torch.py::TestTorch::test_sobolengine_draw_base2_scrambled, test/test_torch.py::TestTorch::test_sobolengine_draw_scrambled, test/test_torch.py::TestTorch::test_sobolengine_fast_forward, test/test_torch.py::TestTorch::test_sobolengine_fast_forward_scrambled, test/test_torch.py::TestTorch::test_sobolengine_first_point, test/test_torch.py::TestTorch::test_sobolengine_high_dim, test/test_torch.py::TestTorch::test_sobolengine_raise, test/test_torch.py::TestTorch::test_sobolengine_reset, test/test_torch.py::TestTorch::test_sobolengine_reset_scrambled, test/test_torch.py::TestTorch::test_sort_neg_dim, test/test_torch.py::TestTorch::test_split_neg_dim, test/test_torch.py::TestTorch::test_split_with_sizes_copy_out, test/test_torch.py::TestTorch::test_squeeze_neg_dim, test/test_torch.py::TestTorch::test_std_neg_dim, test/test_torch.py::TestTorch::test_storage_base_init, test/test_torch.py::TestTorch::test_storage_base_new, test/test_torch.py::TestTorch::test_storage_byteswap, test/test_torch.py::TestTorch::test_storage_casts, test/test_torch.py::TestTorch::test_storage_cycle_via_dict, test/test_torch.py::TestTorch::test_storage_cycle_via_slots, test/test_torch.py::TestTorch::test_storage_dead_weak_ref, test/test_torch.py::TestTorch::test_storage_dealloc, test/test_torch.py::TestTorch::test_storage_dealloc_resurrected, test/test_torch.py::TestTorch::test_storage_dealloc_subclass_resurrected, test/test_torch.py::TestTorch::test_storage_dealloc_subclass_zombie, test/test_torch.py::TestTorch::test_storage_dict_dealloc, test/test_torch.py::TestTorch::test_storage_error, test/test_torch.py::TestTorch::test_storage_error_no_attribute, test/test_torch.py::TestTorch::test_storage_finalizer_dealloc, test/test_torch.py::TestTorch::test_storage_fix_weakref_no_leak, test/test_torch.py::TestTorch::test_storage_from_tensor_dealloc, test/test_torch.py::TestTorch::test_storage_from_tensor_dealloc_resurrected, 
test/test_torch.py::TestTorch::test_storage_from_tensor_dealloc_zombie, test/test_torch.py::TestTorch::test_storage_preserve_nonhermetic_in_hermetic_context, test/test_torch.py::TestTorch::test_storage_resurrected_weak_ref, test/test_torch.py::TestTorch::test_storage_slot_dealloc, test/test_torch.py::TestTorch::test_storage_weakref_dealloc, test/test_torch.py::TestTorch::test_structseq_repr, test/test_torch.py::TestTorch::test_subclass_preserved, test/test_torch.py::TestTorch::test_subclass_tensors, test/test_torch.py::TestTorch::test_sum_neg_dim, test/test_torch.py::TestTorch::test_swap_basic, test/test_torch.py::TestTorch::test_swap_fail_slots, test/test_torch.py::TestTorch::test_t_not_2d_error, test/test_torch.py::TestTorch::test_tensor_base_init, test/test_torch.py::TestTorch::test_tensor_base_new, test/test_torch.py::TestTorch::test_tensor_ctor_scalar, test/test_torch.py::TestTorch::test_tensor_cycle_via_dict, test/test_torch.py::TestTorch::test_tensor_cycle_via_slots, test/test_torch.py::TestTorch::test_tensor_dead_weak_ref, test/test_torch.py::TestTorch::test_tensor_dict_dealloc, test/test_torch.py::TestTorch::test_tensor_finalizer_dealloc, test/test_torch.py::TestTorch::test_tensor_fix_weakref_no_leak, test/test_torch.py::TestTorch::test_tensor_item_no_warning, test/test_torch.py::TestTorch::test_tensor_ressurecting_clear, test/test_torch.py::TestTorch::test_tensor_resurrected_weak_ref, test/test_torch.py::TestTorch::test_tensor_set, test/test_torch.py::TestTorch::test_tensor_set_errors, test/test_torch.py::TestTorch::test_tensor_slot_dealloc, test/test_torch.py::TestTorch::test_tensor_weakref_dealloc, test/test_torch.py::TestTorch::test_tensor_where_scalar, test/test_torch.py::TestTorch::test_tensor_with_grad_to_scalar_warning, test/test_torch.py::TestTorch::test_tensoriterator_output_setup, test/test_torch.py::TestTorch::test_terminate_handler_on_crash, test/test_torch.py::TestTorch::test_to, test/test_torch.py::TestTorch::test_to_with_tensor, 
test/test_torch.py::TestTorch::test_topk_neg_dim, test/test_torch.py::TestTorch::test_torch_from_file, test/test_torch.py::TestTorch::test_transpose_neg_dim, test/test_torch.py::TestTorch::test_type, test/test_torch.py::TestTorch::test_type_alias, test/test_torch.py::TestTorch::test_type_conversion_via_dtype_name, test/test_torch.py::TestTorch::test_typed_storage_deprecation_warning, test/test_torch.py::TestTorch::test_typed_storage_internal_no_warning, test/test_torch.py::TestTorch::test_unbind_neg_dim, test/test_torch.py::TestTorch::test_unflatten, test/test_torch.py::TestTorch::test_unfold_neg_dim, test/test_torch.py::TestTorch::test_unsqueeze_neg_dim, test/test_torch.py::TestTorch::test_upsample_nearest1d_meta, test/test_torch.py::TestTorch::test_upsample_nearest2d_meta, test/test_torch.py::TestTorch::test_var_neg_dim, test/test_torch.py::TestTorch::test_warn_types, test/test_torch.py::TestTorch::test_wildcard_import, test/test_torch.py::TestVitalSignsCudaCUDA::test_cuda_vitals_gpu_only_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test__local_scalar_dense_with_empty_tensor_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcdiv_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_cuda_errors_with_cpu_scalars_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_False_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_addcmul_use_cpu_scalar_True_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_assertRaisesRegex_ignore_msg_non_native_device_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_edge_cases_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_edge_cases_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_edge_cases_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_p_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_p_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_p_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bernoulli_self_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bfloat16_neg_abs_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bool_tensor_value_change_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_add_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_addcdiv_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_addcmul_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_atan2_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_copy_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_dist_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_div_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_eq_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_fmod_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_ge_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_gt_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_le_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_lerp_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_lt_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_map2_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_map_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_masked_fill_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_masked_scatter_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_masked_select_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_max_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_min_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_mul_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_ne_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_pow_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_remainder_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_broadcast_fn_sub_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_int32, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_uint16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_uint32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_uint64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_bytes_to_scalar_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cauchy_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cauchy_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cauchy_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cauchy_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cauchy_kstest_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cauchy_no_inf_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cauchy_no_inf_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_cuda_backward_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_empty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_euclidean_large_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_grad_p_lt_1_no_nan_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_large_batch_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_large_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_non_contiguous_batch_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_non_contiguous_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_norm_batch_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_norm_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cdist_same_inputs_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_check_tensor_all_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_check_tensor_internal_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_clone_all_dtypes_and_devices_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_clone_not_memory_dense_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_clone_zero_stride_dim_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_complex_half_experimental_warning_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_constants_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_conv_transposed_backward_agnostic_to_memory_format_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_conv_transposed_large_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_complex32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy__cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy_all_dtypes_and_devices_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy_math_view_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy_mem_overlap_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy_transpose_math_view_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy_transpose_math_view_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_copy_transpose_math_view_cuda_int64, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_corrcoef_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_corrcoef_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_corrcoef_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cov_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cpp_warnings_have_python_context_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cummax_cummin_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cummax_discontiguous_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cummin_discontiguous_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cumprod_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cumsum_64bit_indexing_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_cumsum_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deepcopy_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deepcopy_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deepcopy_scalar_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deepcopy_scalar_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_cumsum_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_complex32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_int16, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_uint16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_uint32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_uint64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_empty_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_interpolate_bilinear_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_replication_pad2d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_uint16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_uint32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_uint64, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_resize_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_device_guard_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_diff_noncontig_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_dim_function_empty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_discontiguous_out_cumsum_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_dist_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_dtypetensor_warnings_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_expected_failure_xla_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_kstest_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_kstest_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_kstest_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_kstest_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_no_zero_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_exponential_no_zero_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gather_backward_deterministic_path_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gather_backward_one_dim_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_geometric_kstest_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scale_will_not_overflow_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaler_deprecated_warning_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaler_pass_itself_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_accumulation_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach0_fused0_AdamW_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach0_fused0_Adam_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach0_fused0_SGD_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach2_fused_True_AdamW_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach2_fused_True_Adam_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach2_fused_True_SGD_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach_True_fused1_AdamW_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach_True_fused1_Adam_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_autocast_foreach_True_fused1_SGD_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_clipping_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_clipping_separate_unscale_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_multiple_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_penalty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_state_dict_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_unscale_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_unscale_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_unscale_sparse_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_grad_scaling_update_scale_cuda_float32, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_all_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_all_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_all_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_extreme_cases_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_extreme_cases_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_extreme_cases_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_spacing_list_length_error_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_spacing_list_length_error_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_spacing_list_length_error_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_gradient_type_promotion_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_hook_remove_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_index_add_large_inputs_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_index_add_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_index_copy_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_index_fill_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_index_put_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_int64_upsample3d_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_invalid_shapes_grid_sampler_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_is_set_to_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_is_signed_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_complex32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_complex64, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_float8_e4m3fn, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_float8_e4m3fnuz, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_float8_e5m2, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_float8_e5m2fnuz, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_uint16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_uint32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_uint64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_item_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_large_cumprod_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_large_cumsum_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_float64, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_binary_op_no_materialize_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_float64, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lazy_clone_view_materialize_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_log_normal_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_log_normal_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_log_normal_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_log_normal_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_logcumsumexp_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_lognormal_kstest_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_bool_tensor_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_bfloat16_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_bfloat16_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_bool_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_bool_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_complex128_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_complex128_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_complex64_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_complex64_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_float16_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_float16_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_float32_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_float32_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_float64_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_float64_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_int16_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_int16_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_int32_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_int32_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_int64_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_int64_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_int8_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_int8_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_uint8_bool, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_cuda_uint8_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_fill_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_bool_tensor_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_inplace_noncontiguous_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_large_tensor_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_scatter_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_int16, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_masked_select_discontiguous_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_clone_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_consistency_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_cpu_and_cuda_ops_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_empty_like_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_factory_like_functions_preserve_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_operators_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_preserved_after_permute_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_propagation_rules_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_to_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_type_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_memory_format_type_shortcuts_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_module_share_memory_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_cpu_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_cpu_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_cpu_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_deterministic_cuda_float16, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_deterministic_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_deterministic_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_device_constrain_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_empty_w_replacement_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_empty_wo_replacement_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_gpu_device_constrain_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_multinomial_rng_state_advance_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_narrow_copy_non_contiguous_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_narrow_empty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_no_nondeterministic_alert_interpolate_bilinear_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_no_nondeterministic_alert_interpolate_trilinear_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_AdaptiveAvgPool2d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_AdaptiveAvgPool3d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_AdaptiveMaxPool2d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_AvgPool3d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_CTCLoss_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_EmbeddingBag_max_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_FractionalMaxPool2d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_FractionalMaxPool3d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxPool3d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool1d_cuda_float16, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool1d_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool1d_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool2d_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool2d_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool2d_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool3d_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool3d_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_MaxUnpool3d_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_NLLLoss_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_ReflectionPad1d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_ReflectionPad3d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_ReplicationPad1d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_ReplicationPad2d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_ReplicationPad3d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_bincount_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_grid_sample_2d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_grid_sample_3d_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_histc_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_histc_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_interpolate_bicubic_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_interpolate_bilinear_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_interpolate_linear_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_interpolate_trilinear_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_kthvalue_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_median_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_put_accumulate_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_alert_put_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_resize_quantized_cuda_qint32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_resize_quantized_cuda_qint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_resize_quantized_cuda_quint2x4, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_resize_quantized_cuda_quint4x2, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nondeterministic_resize_quantized_cuda_quint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_normal_kstest_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_normal_kstest_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_normal_kstest_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_normal_kstest_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_nullary_op_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_pairwise_distance_empty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_complex128, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_parallel_cow_materialize_error_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_params_invalidated_with_grads_invalidated_and_graph_partition_AdamW_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_params_invalidated_with_grads_invalidated_and_graph_partition_Adam_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_params_invalidated_with_grads_invalidated_and_graph_partition_SGD_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_params_invalidated_with_grads_invalidated_between_unscale_and_step_AdamW_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_params_invalidated_with_grads_invalidated_between_unscale_and_step_Adam_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_params_invalidated_with_grads_invalidated_between_unscale_and_step_SGD_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_pdist_empty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_pdist_norm_large_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_pickle_gradscaler_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_pin_memory_from_constructor_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_accumulate_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_empty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_put_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_reduced_type_float_copy_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_reduced_type_float_copy_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_repeat_interleave_cuda, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_scalar_check_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_add_bool_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_add_non_unique_index_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_add_one_dim_deterministic_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_add_to_large_input_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_bool_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_multiply_unsupported_dtypes_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_multiply_unsupported_dtypes_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_non_unique_index_cuda_uint8, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_operations_to_large_input_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_int16, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_reduce_scalar_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_to_large_input_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_scatter_zero_size_index_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_serialization_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_default_tensor_type_warnings_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_set_storage_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_shift_mem_overlap_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_skip_xla_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_all_devices_non_blocking_False_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_all_devices_non_blocking_True_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_bool, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_uint16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_uint32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_uint64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_errors_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_bfloat16, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_from_tensor_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_meta_ok_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_bool, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_qint32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_qint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_quint4x2, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_quint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_setitem_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_storage_use_count_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_strides_propagation_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_sync_warning_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_int64, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_take_empty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_uint16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_uint32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_uint64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_from_storage_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_set_errors_multigpu_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_shape_empty_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_complex64, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_storage_type_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_tensor_type_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_ternary_op_mem_overlap_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_bool, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_complex128, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_complex64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_float16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_int16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_int32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_int64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_int8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_typed_storage_meta_cuda_uint8, test/test_torch.py::TestTorchDeviceTypeCUDA::test_uniform_kstest_cuda_bfloat16, test/test_torch.py::TestTorchDeviceTypeCUDA::test_uniform_kstest_cuda_float16, 
test/test_torch.py::TestTorchDeviceTypeCUDA::test_uniform_kstest_cuda_float32, test/test_torch.py::TestTorchDeviceTypeCUDA::test_uniform_kstest_cuda_float64, test/test_torch.py::TestTorchDeviceTypeCUDA::test_untyped_storage_meta_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_warn_always_caught_cuda, test/test_torch.py::TestTorchDeviceTypeCUDA::test_where_scalar_handcrafted_values_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_advancedindex_mixed_cpu_devices_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_advancedindex_mixed_devices_error_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_clamp_cuda_float32, test/test_torch.py::TestDevicePrecisionCUDA::test_clamp_cuda_float64, test/test_torch.py::TestDevicePrecisionCUDA::test_clamp_cuda_int64, test/test_torch.py::TestDevicePrecisionCUDA::test_copy_broadcast_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_copy_noncontig_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_cuda_device_idx_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_device_serialization_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_from_sequence_cuda_float16, test/test_torch.py::TestDevicePrecisionCUDA::test_from_sequence_cuda_float32, test/test_torch.py::TestDevicePrecisionCUDA::test_from_sequence_cuda_float64, test/test_torch.py::TestDevicePrecisionCUDA::test_from_sequence_cuda_int16, test/test_torch.py::TestDevicePrecisionCUDA::test_from_sequence_cuda_int32, test/test_torch.py::TestDevicePrecisionCUDA::test_from_sequence_cuda_int64, test/test_torch.py::TestDevicePrecisionCUDA::test_from_sequence_cuda_int8, test/test_torch.py::TestDevicePrecisionCUDA::test_from_sequence_cuda_uint8, test/test_torch.py::TestDevicePrecisionCUDA::test_index_add_bfloat16_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_multidevice_serialization_cuda, test/test_torch.py::TestDevicePrecisionCUDA::test_type_conversions_same_device_cuda 2025-10-10T01:37:28.8703355Z 2025-10-10T01:37:28.8703472Z Running 
test_fx 1/1 ... [2025-10-10 01:37:28.804161] 2025-10-10T01:37:28.8703752Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:37:28.8704456Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:37:28.804719] 2025-10-10T01:41:04.9869573Z 2025-10-10T01:41:04.9875406Z test_fx 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_1.1_e7dcd60d8c06b885_.log 2025-10-10T01:41:05.0522103Z Running 1269 items in this shard: test/test_fx.py::TestCommonPass::test_correctness_CSEPass_MutationInput_cpu, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_MutationInput_cuda, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_MutationMetadata_cpu, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_MutationMetadata_cuda, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_MutationTorchTensorCall_cpu, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_MutationTorchTensorCall_cuda, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_Mutation_cpu, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_Mutation_cuda, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_ReturnList_cpu, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_ReturnList_cuda, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_TakeList_cpu, test/test_fx.py::TestCommonPass::test_correctness_CSEPass_TakeList_cuda, test/test_fx.py::TestCommonPass::test_correctness_factory_CSEPass_FactoryFunctionCall_cpu, test/test_fx.py::TestCommonPass::test_correctness_factory_CSEPass_FactoryFunctionCall_cuda, test/test_fx.py::TestCommonPass::test_correctness_factory_CSEPass_MutationFactory_cpu, test/test_fx.py::TestCommonPass::test_correctness_factory_CSEPass_MutationFactory_cuda, test/test_fx.py::TestCSEPass::test_banned_list, 
test/test_fx.py::TestCSEPass::test_empty, test/test_fx.py::TestCSEPass::test_immutable_list_multiple_entries, test/test_fx.py::TestCSEPass::test_immutable_list_type, test/test_fx.py::TestCSEPass::test_kwarg, test/test_fx.py::TestCSEPass::test_nested_immutable_list_type, test/test_fx.py::TestCSEPass::test_nochange, test/test_fx.py::TestCSEPass::test_rand_like, test/test_fx.py::TestCSEPass::test_rand_n, test/test_fx.py::TestCSEPass::test_random, test/test_fx.py::TestCSEPass::test_simple, test/test_fx.py::TestCSEPass::test_simple_2, test/test_fx.py::TestCSEPass::test_simple_multiple_same_ops, test/test_fx.py::TestCSEPass::test_two_args, test/test_fx.py::TestCSEPass::test_two_args_default, test/test_fx.py::TestDCE::test_dead_chain, test/test_fx.py::TestDCE::test_dead_getattr, test/test_fx.py::TestDCE::test_dead_placeholder, test/test_fx.py::TestDCE::test_dead_placeholder_with_user, test/test_fx.py::TestDCE::test_impure_custom, test/test_fx.py::TestDCE::test_impure_kwargs, test/test_fx.py::TestDCE::test_impure_nodes_args, test/test_fx.py::TestDCE::test_impure_random, test/test_fx.py::TestDCE::test_keep_collectives, test/test_fx.py::TestDCE::test_keep_collectives_no_overload, test/test_fx.py::TestDCE::test_keep_module_with_side_effects, test/test_fx.py::TestDCE::test_keep_setitem, test/test_fx.py::TestDCE::test_keep_torch_assert, test/test_fx.py::TestDCE::test_simple, test/test_fx.py::TestConstFold::test_check_inline_non_const, test/test_fx.py::TestConstFold::test_check_inline_non_const_mult_return, test/test_fx.py::TestConstFold::test_check_skip_folding_quant_dequant_pattern, test/test_fx.py::TestConstFold::test_const_fold_basic_one_attr_name_collision, test/test_fx.py::TestConstFold::test_const_fold_basic_one_attr_no_name_collision, test/test_fx.py::TestConstFold::test_const_fold_basic_placeholder_reordered, test/test_fx.py::TestConstFold::test_const_fold_basic_two_attr, test/test_fx.py::TestConstFold::test_const_fold_basic_two_attr_three_input, 
test/test_fx.py::TestConstFold::test_const_fold_has_inlined_call_module_node, test/test_fx.py::TestConstFold::test_const_fold_module_attr, test/test_fx.py::TestConstFold::test_const_fold_multi_const_folded_attrs, test/test_fx.py::TestConstFold::test_const_fold_noop, test/test_fx.py::TestConstFold::test_const_fold_submod_hierarchy, test/test_fx.py::TestConstFold::test_const_fold_tensor_meta, test/test_fx.py::TestConstFold::test_const_fold_unused_placeholder, test/test_fx.py::TestConstFold::test_dict_output, test/test_fx.py::TestConstFold::test_fold_module, test/test_fx.py::TestConstFold::test_retain_node_meta, test/test_fx.py::TestConstFold::test_three_outputs, test/test_fx.py::TestConstFold::test_two_outputs, test/test_fx.py::TestConstParamShapeInControlFlow::test_param_dim_const, test/test_fx.py::TestConstParamShapeInControlFlow::test_param_ndim_const, test/test_fx.py::TestConstParamShapeInControlFlow::test_param_nelement_const, test/test_fx.py::TestConstParamShapeInControlFlow::test_param_numel_const, test/test_fx.py::TestConstParamShapeInControlFlow::test_param_shape_const, test/test_fx.py::TestConstParamShapeInControlFlow::test_param_size_const, test/test_fx.py::AnnotationsTest::test_annotate, test/test_fx.py::AnnotationsTest::test_annotations, test/test_fx.py::AnnotationsTest::test_broadcasting1, test/test_fx.py::AnnotationsTest::test_broadcasting2, test/test_fx.py::AnnotationsTest::test_broadcasting3, test/test_fx.py::AnnotationsTest::test_consistency, test/test_fx.py::AnnotationsTest::test_precision, test/test_fx.py::TypeCheckerTest::test_flatten_fully_static, test/test_fx.py::TypeCheckerTest::test_resnet50, test/test_fx.py::TypeCheckerTest::test_symbolic_add_with_broadcast, test/test_fx.py::TypeCheckerTest::test_symbolic_add_with_broadcast_2, test/test_fx.py::TypeCheckerTest::test_type_check_add_false, test/test_fx.py::TypeCheckerTest::test_type_check_add_true, test/test_fx.py::TypeCheckerTest::test_type_check_add_with_broadcast, 
test/test_fx.py::TypeCheckerTest::test_type_check_add_with_scalar, test/test_fx.py::TypeCheckerTest::test_type_check_batch_norm_2D, test/test_fx.py::TypeCheckerTest::test_type_check_batch_norm_2D_broadcast, test/test_fx.py::TypeCheckerTest::test_type_check_batch_norm_2D_false, test/test_fx.py::TypeCheckerTest::test_type_check_batch_norm_symbolic, test/test_fx.py::TypeCheckerTest::test_type_check_conv2D, test/test_fx.py::TypeCheckerTest::test_type_check_conv2D_2, test/test_fx.py::TypeCheckerTest::test_type_check_conv2D_2_fully_static, test/test_fx.py::TypeCheckerTest::test_type_check_conv2D_maxpool2d_flatten, test/test_fx.py::TypeCheckerTest::test_type_check_conv2D_types, test/test_fx.py::TypeCheckerTest::test_type_check_flatten, test/test_fx.py::TypeCheckerTest::test_type_check_flatten3, test/test_fx.py::TypeCheckerTest::test_type_check_flatten_2, test/test_fx.py::TypeCheckerTest::test_type_check_reshape_dyn_false, test/test_fx.py::TypeCheckerTest::test_type_check_reshape_dyn_true, test/test_fx.py::TypeCheckerTest::test_type_check_reshape_dyn_true_param_false, test/test_fx.py::TypeCheckerTest::test_type_check_reshape_false, test/test_fx.py::TypeCheckerTest::test_type_check_reshape_true, test/test_fx.py::TypeCheckerTest::test_type_check_symbolic_inferenceconv2D_maxpool2d_flatten, test/test_fx.py::TypeCheckerTest::test_type_check_transpose_False, test/test_fx.py::TypeCheckerTest::test_type_check_transpose_true, test/test_fx.py::TypeCheckerTest::test_type_maxpool2d_fully_static, test/test_fx.py::TypeCheckerTest::test_type_typechecl_maxpool2d_3dinput, test/test_fx.py::TypeCheckerTest::test_typecheck_basicblock, test/test_fx.py::TestMatcher::test_matcher_with_name_node_map_function, test/test_fx.py::TestMatcher::test_matcher_with_name_node_map_module, test/test_fx.py::TestMatcher::test_split_to_graph_and_name_node_map, test/test_fx.py::TestMatcher::test_subgraph_matcher_ignore_literals, test/test_fx.py::TestMatcher::test_subgraph_matcher_with_attributes, 
test/test_fx.py::TestMatcher::test_subgraph_matcher_with_list, test/test_fx.py::TestMatcher::test_subgraph_matcher_with_list_bad, test/test_fx.py::TestMatcher::test_variatic_arg_matching, test/test_fx.py::TestPassManager::test_pass_manager, test/test_fx.py::TestPassManager::test_pass_manager_bad_checks, test/test_fx.py::TestPassManager::test_pass_manager_checks, test/test_fx.py::TestPassManager::test_pass_manager_error, test/test_fx.py::TestPassManager::test_this_before_that_pass_constraint, test/test_fx.py::TestPassManager::test_topological_sort, test/test_fx.py::TestSourceMatcher::test_legalize_slice, test/test_fx.py::TestSourceMatcher::test_module_partitioner_conv_relu_maxpool, test/test_fx.py::TestSourceMatcher::test_module_partitioner_conv_relu_maxpool_torch_fn_export_strict_False, test/test_fx.py::TestSourceMatcher::test_module_partitioner_conv_relu_maxpool_torch_fn_export_strict_True, test/test_fx.py::TestSourceMatcher::test_module_partitioner_functional_conv_relu_conv, test/test_fx.py::TestSourceMatcher::test_module_partitioner_functional_conv_relu_conv_torch_fn_export_strict_False, test/test_fx.py::TestSourceMatcher::test_module_partitioner_functional_conv_relu_conv_torch_fn_export_strict_True, test/test_fx.py::TestSourceMatcher::test_module_partitioner_functional_linear_relu_linear, test/test_fx.py::TestSourceMatcher::test_module_partitioner_functional_linear_relu_linear_torch_fn_export_strict_False, test/test_fx.py::TestSourceMatcher::test_module_partitioner_functional_linear_relu_linear_torch_fn_export_strict_True, test/test_fx.py::TestSourceMatcher::test_module_partitioner_linear_relu_linear, test/test_fx.py::TestSourceMatcher::test_module_partitioner_linear_relu_linear_torch_fn_export_strict_False, test/test_fx.py::TestSourceMatcher::test_module_partitioner_linear_relu_linear_torch_fn_export_strict_True, test/test_fx.py::TestSourceMatcher::test_module_partitioner_weight_tied_strict_False, 
test/test_fx.py::TestSourceMatcher::test_module_partitioner_weight_tied_strict_True, test/test_fx.py::TestSubgraphRewriter::test_matching_pattern_with_list_type_arg, test/test_fx.py::TestSubgraphRewriter::test_matching_variable_arguments, test/test_fx.py::TestSubgraphRewriter::test_replace_pattern_with_callback, test/test_fx.py::TestSubgraphRewriter::test_replace_pattern_with_filters, test/test_fx.py::TestSubgraphRewriter::test_replaced_nodes, test/test_fx.py::TestSubgraphRewriter::test_replacement_with_attrs, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_annotations_int, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_call_method, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_correct_output_replacement, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_graph_argument_order, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_internal_pattern_nodes_cannot_have_users_that_are_not_matched, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_local_revert, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_multiple_pattern_match, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_nodes_with_kwargs, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_pattern_is_entire_graph, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_pattern_output_pattern_node_can_have_users_that_are_not_matched, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_placeholder_matching, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_preserves_logic, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_replace_consecutive_submodules, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_replace_with_duplicated_outputs, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_replace_with_multiple_outputs, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_replaces_referenced_submodules, 
test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_single_pattern_match, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_traced_as_callable, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_with_oneliner_pattern, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_with_overlapping_matches, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_with_trivial_replacement, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_with_unused_args, test/test_fx.py::TestSubgraphRewriter::test_subgraph_rewriter_with_unused_results, test/test_fx.py::TestFX::test_all_input_nodes, test/test_fx.py::TestFX::test_annotation_with_future, test/test_fx.py::TestFX::test_annotations_empty_tuple, test/test_fx.py::TestFX::test_annotations_with_forward_references, test/test_fx.py::TestFX::test_annotations_with_no_forward_references, test/test_fx.py::TestFX::test_annotations_with_non_torch_reference_and_internal_forward_references, test/test_fx.py::TestFX::test_annotations_with_non_torch_reference_and_no_internal_forward_references, test/test_fx.py::TestFX::test_args_kwargs, test/test_fx.py::TestFX::test_args_kwargs_no_self, test/test_fx.py::TestFX::test_ast_rewriter_reassigns_submodules, test/test_fx.py::TestFX::test_ast_rewriter_rewrites_assert, test/test_fx.py::TestFX::test_ast_rewriter_rewrites_assert_with_message, test/test_fx.py::TestFX::test_ast_rewriter_wrap, test/test_fx.py::TestFX::test_ast_rewriter_wrap_fn_directly, test/test_fx.py::TestFX::test_ast_rewriter_wrap_with_submodule, test/test_fx.py::TestFX::test_ast_rewriter_wrapped_via_decorator, test/test_fx.py::TestFX::test_ast_rewriter_wrapped_via_decorator_and_transformed, test/test_fx.py::TestFX::test_autowrap_functions, test/test_fx.py::TestFX::test_concrete_arg_none_assert, test/test_fx.py::TestFX::test_construct_root_dict, test/test_fx.py::TestFX::test_control_flow_tracing, test/test_fx.py::TestFX::test_copy_it, test/test_fx.py::TestFX::test_copy_no_remap, 
test/test_fx.py::TestFX::test_ctx_mgr, test/test_fx.py::TestFX::test_custom_codegen, test/test_fx.py::TestFX::test_custom_codegen_with_transformer, test/test_fx.py::TestFX::test_custom_import, test/test_fx.py::TestFX::test_custom_proxy_dynamic_value, test/test_fx.py::TestFX::test_custom_proxy_input_dependent_control_flow, test/test_fx.py::TestFX::test_custom_proxy_type, test/test_fx.py::TestFX::test_custom_proxy_type_literal, test/test_fx.py::TestFX::test_custom_traceback_not_raised_when_exception_source_is_submodule, test/test_fx.py::TestFX::test_custom_traceback_raised_when_exception_source_is_graphmodule, test/test_fx.py::TestFX::test_deepcopy_graph_with_tracer_cls, test/test_fx.py::TestFX::test_deepcopy_graphmodule, test/test_fx.py::TestFX::test_deepcopy_graphmodule_with_transform, test/test_fx.py::TestFX::test_deepcopy_no_recursion, test/test_fx.py::TestFX::test_deepcopy_recursion_depth, test/test_fx.py::TestFX::test_deepcopy_tracer, test/test_fx.py::TestFX::test_deepcopy_with_submods_params, test/test_fx.py::TestFX::test_delete_unused_submodules_leaf, test/test_fx.py::TestFX::test_delete_unused_values, test/test_fx.py::TestFX::test_dict, test/test_fx.py::TestFX::test_direct_param_use, test/test_fx.py::TestFX::test_disallow_override, test/test_fx.py::TestFX::test_ellipsis, test/test_fx.py::TestFX::test_empty_graph_codegen, test/test_fx.py::TestFX::test_enum, test/test_fx.py::TestFX::test_erase_node_error, test/test_fx.py::TestFX::test_example_shape_prop, test/test_fx.py::TestFX::test_find_uses, test/test_fx.py::TestFX::test_fn_type_annotation_empty, test/test_fx.py::TestFX::test_fn_type_annotations, test/test_fx.py::TestFX::test_fx_and_or, test/test_fx.py::TestFX::test_fx_create_arg, test/test_fx.py::TestFX::test_fx_shifts, test/test_fx.py::TestFX::test_fx_stateless, test/test_fx.py::TestFX::test_get_torch_func_signature, test/test_fx.py::TestFX::test_getitem, test/test_fx.py::TestFX::test_getitem_subproc, test/test_fx.py::TestFX::test_graph_edit_with_proxy, 
test/test_fx.py::TestFX::test_graph_fns, test/test_fx.py::TestFX::test_graph_module, test/test_fx.py::TestFX::test_graph_module_init_buffer_param_copied_dict_init, test/test_fx.py::TestFX::test_graph_module_init_buffer_param_copied_mod_init, test/test_fx.py::TestFX::test_graph_module_replicate_for_dp, test/test_fx.py::TestFX::test_graph_unique_names, test/test_fx.py::TestFX::test_graph_unique_names_manual, test/test_fx.py::TestFX::test_immutable_dict_pytree_ops, test/test_fx.py::TestFX::test_immutable_list_pytree_ops, test/test_fx.py::TestFX::test_imul_code_print, test/test_fx.py::TestFX::test_inf_nan, test/test_fx.py::TestFX::test_inf_nan_kwds, test/test_fx.py::TestFX::test_informative_co_filename, test/test_fx.py::TestFX::test_inline_graph, test/test_fx.py::TestFX::test_insert_arg, test/test_fx.py::TestFX::test_insertion_point, test/test_fx.py::TestFX::test_interpreter, test/test_fx.py::TestFX::test_interpreter_default_args, test/test_fx.py::TestFX::test_interpreter_gc_values, test/test_fx.py::TestFX::test_interpreter_noop_resnet18, test/test_fx.py::TestFX::test_interpreter_not_enough_args, test/test_fx.py::TestFX::test_interpreter_onthefly_swap, test/test_fx.py::TestFX::test_interpreter_other_graph, test/test_fx.py::TestFX::test_interpreter_partial_eval, test/test_fx.py::TestFX::test_interpreter_run_node_override, test/test_fx.py::TestFX::test_interpreter_star_args, test/test_fx.py::TestFX::test_interpreter_with_codegen, test/test_fx.py::TestFX::test_layout, test/test_fx.py::TestFX::test_leaf_module, test/test_fx.py::TestFX::test_lineno_map, test/test_fx.py::TestFX::test_matmul_tracing, test/test_fx.py::TestFX::test_metadata_on_ph, test/test_fx.py::TestFX::test_module_deepcopy_edit_nodes, test/test_fx.py::TestFX::test_move_before, test/test_fx.py::TestFX::test_multi_insert_point, test/test_fx.py::TestFX::test_multiple_default_args, test/test_fx.py::TestFX::test_named_tuple_inlined, test/test_fx.py::TestFX::test_namedtuple_return_qualname, 
test/test_fx.py::TestFX::test_namedtuple_return_trace, test/test_fx.py::TestFX::test_native_callable, test/test_fx.py::TestFX::test_nn_module_stack, test/test_fx.py::TestFX::test_no_mutation, test/test_fx.py::TestFX::test_node_tagging, test/test_fx.py::TestFX::test_nonetype_annotation, test/test_fx.py::TestFX::test_partial_trace, test/test_fx.py::TestFX::test_pickle_custom_import, test/test_fx.py::TestFX::test_pickle_graphmodule, test/test_fx.py::TestFX::test_pickle_nonetype_annotation, test/test_fx.py::TestFX::test_pickle_torch_custom_ops, test/test_fx.py::TestFX::test_prepend_self, test/test_fx.py::TestFX::test_pretty_print, test/test_fx.py::TestFX::test_pretty_print_graph, test/test_fx.py::TestFX::test_pretty_print_node, test/test_fx.py::TestFX::test_pretty_print_targets, test/test_fx.py::TestFX::test_print_graph, test/test_fx.py::TestFX::test_profiler_ranges_side_effect, test/test_fx.py::TestFX::test_proxy_deepcopy_with_tracer, test/test_fx.py::TestFX::test_proxy_deepcopy_without_tracer, test/test_fx.py::TestFX::test_pytree, test/test_fx.py::TestFX::test_pytree_concrete, test/test_fx.py::TestFX::test_reassign_args_kwargs_uses, test/test_fx.py::TestFX::test_regular_and_default_args, test/test_fx.py::TestFX::test_remove_uses, test/test_fx.py::TestFX::test_remove_uses_with_custom_filter, test/test_fx.py::TestFX::test_replace_input, test/test_fx.py::TestFX::test_replace_uses, test/test_fx.py::TestFX::test_reserved_getattr, test/test_fx.py::TestFX::test_return_tuple, test/test_fx.py::TestFX::test_return_type_exists, test/test_fx.py::TestFX::test_return_type_exists_pre_pep585, test/test_fx.py::TestFX::test_script_method_trace, test/test_fx.py::TestFX::test_script_tensor_constant, test/test_fx.py::TestFX::test_sequential, test/test_fx.py::TestFX::test_shape_prop_aggregate, test/test_fx.py::TestFX::test_shape_prop_layout, test/test_fx.py::TestFX::test_shape_prop_layout_3d, test/test_fx.py::TestFX::test_shape_prop_unbacked_sym, 
test/test_fx.py::TestFX::test_single_default_arg, test/test_fx.py::TestFX::test_snake_case, test/test_fx.py::TestFX::test_sqrt, test/test_fx.py::TestFX::test_stack_traces, test/test_fx.py::TestFX::test_stack_traces_with_transformer, test/test_fx.py::TestFX::test_string_literal_return, test/test_fx.py::TestFX::test_submodule_manipulation_API, test/test_fx.py::TestFX::test_symbolic_trace_assert, test/test_fx.py::TestFX::test_symbolic_trace_sequential, test/test_fx.py::TestFX::test_tensor_attribute, test/test_fx.py::TestFX::test_tensor_attribute_coalseced, test/test_fx.py::TestFX::test_tensor_constant, test/test_fx.py::TestFX::test_throw_out_variant, test/test_fx.py::TestFX::test_torch_custom_ops, test/test_fx.py::TestFX::test_torch_fx_getattr, test/test_fx.py::TestFX::test_torch_fx_len, test/test_fx.py::TestFX::test_torch_op_overloads, test/test_fx.py::TestFX::test_torchbind_class_attribute_in_fx, test/test_fx.py::TestFX::test_torchbind_class_attribute_in_fx_tensor_arg, test/test_fx.py::TestFX::test_trace_buffer_slice, test/test_fx.py::TestFX::test_trace_dict_int_keys, test/test_fx.py::TestFX::test_trace_dict_proxy_keys, test/test_fx.py::TestFX::test_trace_fn_constant, test/test_fx.py::TestFX::test_trace_function, test/test_fx.py::TestFX::test_trace_multiple_funcs, test/test_fx.py::TestFX::test_trace_return_dataclass, test/test_fx.py::TestFX::test_trace_return_dataclass_nested, test/test_fx.py::TestFX::test_trace_return_namedtuple, test/test_fx.py::TestFX::test_tracing_graphmodules_as_leaf_submodules, test/test_fx.py::TestFX::test_transformer_multi_outputs, test/test_fx.py::TestFX::test_transformer_noop, test/test_fx.py::TestFX::test_transformer_op_swap, test/test_fx.py::TestFX::test_transformer_preserves_nn_module_stack_for_get_attr, test/test_fx.py::TestFX::test_tuple_no_subscript, test/test_fx.py::TestFX::test_typename_print, test/test_fx.py::TestFX::test_typename_print_pre_pep585, test/test_fx.py::TestFX::test_unpack, 
test/test_fx.py::TestFX::test_unpack_dict_better_error, test/test_fx.py::TestFX::test_unpack_list_better_error, test/test_fx.py::TestFX::test_update_args_api, test/test_fx.py::TestFX::test_update_args_kwargs_yells_at_you, test/test_fx.py::TestFX::test_update_kwargs_api, test/test_fx.py::TestFX::test_user_friendly_call_provenance_with_function, test/test_fx.py::TestFX::test_user_friendly_call_provenance_with_module, test/test_fx.py::TestFX::test_varargs_concrete, test/test_fx.py::TestFX::test_wrap, test/test_fx.py::TestFX::test_wrap_decorated_function, test/test_fx.py::TestFX::test_wrap_fn_directly, test/test_fx.py::TestFX::test_wrap_with_submodule, test/test_fx.py::TestFX::test_wrapped_method, test/test_fx.py::TestFX::test_wrapped_retrace, test/test_fx.py::TestFX::test_wrapped_via_decorator, test/test_fx.py::TestFX::test_wrapped_via_decorator_and_transformed, test/test_fx.py::TestFX::test_wrong_target_type, test/test_fx.py::TestFX::test_wrong_topo, test/test_fx.py::TestFXAPIBackwardCompatibility::test_adding_side_effect_function, test/test_fx.py::TestFXAPIBackwardCompatibility::test_class_member_back_compat, test/test_fx.py::TestFXAPIBackwardCompatibility::test_function_back_compat, test/test_fx.py::TestFXAPIBackwardCompatibility::test_preserve_unused_attr_after_unpickle, test/test_fx.py::TestFXAPIBackwardCompatibility::test_public_api_surface, test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_avg_pool1d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_avg_pool2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_avg_pool3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_max_pool1d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_max_pool1d_with_indices, test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_max_pool2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_max_pool2d_with_indices, 
test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_max_pool3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_adaptive_max_pool3d_with_indices, test/test_fx.py::TestFunctionalTracing::test_nn_functional_affine_grid, test/test_fx.py::TestFunctionalTracing::test_nn_functional_alpha_dropout, test/test_fx.py::TestFunctionalTracing::test_nn_functional_avg_pool1d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_avg_pool2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_avg_pool3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_batch_norm, test/test_fx.py::TestFunctionalTracing::test_nn_functional_bilinear, test/test_fx.py::TestFunctionalTracing::test_nn_functional_binary_cross_entropy, test/test_fx.py::TestFunctionalTracing::test_nn_functional_binary_cross_entropy_with_logits, test/test_fx.py::TestFunctionalTracing::test_nn_functional_celu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_celu_, test/test_fx.py::TestFunctionalTracing::test_nn_functional_channel_shuffle, test/test_fx.py::TestFunctionalTracing::test_nn_functional_conv1d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_conv2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_conv3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_conv_tbc, test/test_fx.py::TestFunctionalTracing::test_nn_functional_conv_transpose1d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_conv_transpose2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_conv_transpose3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_cosine_embedding_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_cosine_similarity, test/test_fx.py::TestFunctionalTracing::test_nn_functional_cross_entropy, test/test_fx.py::TestFunctionalTracing::test_nn_functional_ctc_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_dropout, test/test_fx.py::TestFunctionalTracing::test_nn_functional_dropout1d, 
test/test_fx.py::TestFunctionalTracing::test_nn_functional_dropout2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_dropout3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_elu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_elu_, test/test_fx.py::TestFunctionalTracing::test_nn_functional_embedding, test/test_fx.py::TestFunctionalTracing::test_nn_functional_embedding_bag, test/test_fx.py::TestFunctionalTracing::test_nn_functional_feature_alpha_dropout, test/test_fx.py::TestFunctionalTracing::test_nn_functional_fold, test/test_fx.py::TestFunctionalTracing::test_nn_functional_fractional_max_pool2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_fractional_max_pool2d_with_indices, test/test_fx.py::TestFunctionalTracing::test_nn_functional_fractional_max_pool3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_fractional_max_pool3d_with_indices, test/test_fx.py::TestFunctionalTracing::test_nn_functional_gaussian_nll_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_gelu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_glu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_grid_sample, test/test_fx.py::TestFunctionalTracing::test_nn_functional_group_norm, test/test_fx.py::TestFunctionalTracing::test_nn_functional_gumbel_softmax, test/test_fx.py::TestFunctionalTracing::test_nn_functional_hardshrink, test/test_fx.py::TestFunctionalTracing::test_nn_functional_hardsigmoid, test/test_fx.py::TestFunctionalTracing::test_nn_functional_hardswish, test/test_fx.py::TestFunctionalTracing::test_nn_functional_hardtanh, test/test_fx.py::TestFunctionalTracing::test_nn_functional_hardtanh_, test/test_fx.py::TestFunctionalTracing::test_nn_functional_hinge_embedding_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_huber_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_instance_norm, test/test_fx.py::TestFunctionalTracing::test_nn_functional_interpolate, 
test/test_fx.py::TestFunctionalTracing::test_nn_functional_kl_div, test/test_fx.py::TestFunctionalTracing::test_nn_functional_l1_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_layer_norm, test/test_fx.py::TestFunctionalTracing::test_nn_functional_leaky_relu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_leaky_relu_, test/test_fx.py::TestFunctionalTracing::test_nn_functional_linear, test/test_fx.py::TestFunctionalTracing::test_nn_functional_local_response_norm, test/test_fx.py::TestFunctionalTracing::test_nn_functional_log_softmax, test/test_fx.py::TestFunctionalTracing::test_nn_functional_logsigmoid, test/test_fx.py::TestFunctionalTracing::test_nn_functional_lp_pool1d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_lp_pool2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_lp_pool3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_margin_ranking_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_pool1d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_pool1d_with_indices, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_pool2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_pool2d_with_indices, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_pool3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_pool3d_with_indices, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_unpool1d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_unpool2d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_max_unpool3d, test/test_fx.py::TestFunctionalTracing::test_nn_functional_mish, test/test_fx.py::TestFunctionalTracing::test_nn_functional_mse_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_multi_head_attention_forward, test/test_fx.py::TestFunctionalTracing::test_nn_functional_multi_margin_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_multilabel_margin_loss, 
test/test_fx.py::TestFunctionalTracing::test_nn_functional_multilabel_soft_margin_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_native_channel_shuffle, test/test_fx.py::TestFunctionalTracing::test_nn_functional_nll_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_normalize, test/test_fx.py::TestFunctionalTracing::test_nn_functional_one_hot, test/test_fx.py::TestFunctionalTracing::test_nn_functional_pad, test/test_fx.py::TestFunctionalTracing::test_nn_functional_pairwise_distance, test/test_fx.py::TestFunctionalTracing::test_nn_functional_pdist, test/test_fx.py::TestFunctionalTracing::test_nn_functional_pixel_shuffle, test/test_fx.py::TestFunctionalTracing::test_nn_functional_pixel_unshuffle, test/test_fx.py::TestFunctionalTracing::test_nn_functional_poisson_nll_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_prelu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_relu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_relu6, test/test_fx.py::TestFunctionalTracing::test_nn_functional_relu_, test/test_fx.py::TestFunctionalTracing::test_nn_functional_rms_norm, test/test_fx.py::TestFunctionalTracing::test_nn_functional_rrelu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_rrelu_, test/test_fx.py::TestFunctionalTracing::test_nn_functional_scaled_dot_product_attention, test/test_fx.py::TestFunctionalTracing::test_nn_functional_scaled_mm, test/test_fx.py::TestFunctionalTracing::test_nn_functional_selu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_selu_, test/test_fx.py::TestFunctionalTracing::test_nn_functional_silu, test/test_fx.py::TestFunctionalTracing::test_nn_functional_smooth_l1_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_soft_margin_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_softmax, test/test_fx.py::TestFunctionalTracing::test_nn_functional_softmin, test/test_fx.py::TestFunctionalTracing::test_nn_functional_softplus, 
test/test_fx.py::TestFunctionalTracing::test_nn_functional_softshrink, test/test_fx.py::TestFunctionalTracing::test_nn_functional_threshold, test/test_fx.py::TestFunctionalTracing::test_nn_functional_threshold_, test/test_fx.py::TestFunctionalTracing::test_nn_functional_triplet_margin_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_triplet_margin_with_distance_loss, test/test_fx.py::TestFunctionalTracing::test_nn_functional_unfold, test/test_fx.py::TestFunctionalTracing::test_nn_functional_upsample, test/test_fx.py::TestFunctionalTracing::test_nn_functional_upsample_bilinear, test/test_fx.py::TestFunctionalTracing::test_nn_functional_upsample_nearest, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_H_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_T_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive___getitem___cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive___radd___cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive___rdiv___cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive___rmatmul___cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive___rmod___cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive___rmul___cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive___rpow___cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive___rsub___cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__batch_norm_with_update_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__chunk_cat_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__native_batch_norm_legit_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__segment_reduce_lengths_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__segment_reduce_offsets_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__softmax_backward_data_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__unsafe_masked_index_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__unsafe_masked_index_put_accumulate_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive__upsample_bilinear2d_aa_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_abs_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_acos_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_acosh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_add_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_addbmm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_addcdiv_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_addcmul_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_addmm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_addmm_decomposed_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_addmv_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_addr_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_alias_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_all_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_allclose_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_amax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_amin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_aminmax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_angle_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_any_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_arange_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_argmax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_argmin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_argsort_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_argwhere_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_as_strided_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_as_strided_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_as_strided_partial_views_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_as_strided_scatter_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_asin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_asinh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_atan2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_atan_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_atanh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_atleast_1d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_atleast_2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_atleast_3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_baddbmm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_bernoulli_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_bfloat16_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_block_diag_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_bmm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_bool_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_broadcast_shapes_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_broadcast_tensors_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_broadcast_to_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_bucketize_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_byte_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cartesian_prod_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cat_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cauchy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cdist_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cdouble_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_ceil_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cfloat_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_chalf_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_char_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cholesky_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cholesky_inverse_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cholesky_solve_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_chunk_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_clamp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_clamp_max_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_clamp_min_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_clone_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_column_stack_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_combinations_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_complex_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_conj_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_conj_physical_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_constant_pad_nd_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_contiguous_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_copysign_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_corrcoef_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cos_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cosh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_count_nonzero_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cov_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cross_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cummax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cummin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cumprod_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cumsum_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_cumulative_trapezoid_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_deg2rad_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_diag_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_diag_embed_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_diagflat_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_diagonal_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_diagonal_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_diagonal_scatter_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_diff_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_digamma_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_dist_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_div_floor_rounding_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_div_no_rounding_mode_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_div_trunc_rounding_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_dot_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_double_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_dsplit_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_dstack_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_einsum_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_empty_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_empty_like_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_empty_permuted_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_empty_strided_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_eq_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_equal_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_erf_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_erfc_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_erfinv_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_exp2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_exp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_expand_as_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_expand_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_expand_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_expm1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_exponential_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_eye_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_fft2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_fft_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_fftn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_fftshift_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_hfft2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_hfft_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_hfftn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_ifft2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_ifft_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_ifftn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_ifftshift_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_ihfft2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_ihfft_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_ihfftn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_irfft2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_irfft_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_irfftn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_rfft2_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_rfft_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fft_rfftn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fill_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_flatten_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_flip_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fliplr_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_flipud_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_float_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_float_power_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_floor_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_floor_divide_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fmax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fmin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_fmod_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_frac_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_frexp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_full_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_full_like_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_gather_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_ge_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_geometric_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_geqrf_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_gradient_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_grid_sampler_2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_grid_sampler_3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_gt_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_half_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_hash_tensor_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_heaviside_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_histc_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_hsplit_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_hstack_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_hypot_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_i0_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_igamma_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_igammac_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_add_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_fill_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_put_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_reduce_amax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_reduce_amin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_reduce_mean_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_reduce_prod_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_index_select_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_inner_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_int_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_isclose_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_isfinite_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_isin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_isinf_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_isnan_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_isneginf_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_isposinf_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_isreal_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_item_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_jiterator_2inputs_2outputs_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_jiterator_4inputs_with_extra_args_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_jiterator_binary_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_jiterator_binary_return_by_ref_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_jiterator_unary_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_kron_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_kthvalue_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_ldexp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_le_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_lerp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_lgamma_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_cholesky_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_cholesky_ex_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_cond_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_cross_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_det_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_diagonal_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_eig_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_eigh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_eigvals_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_eigvalsh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_householder_product_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_inv_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_inv_ex_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_ldl_factor_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_ldl_factor_ex_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_ldl_solve_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_lstsq_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_lstsq_grad_oriented_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_lu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_lu_factor_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_lu_factor_ex_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_lu_solve_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_matrix_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_matrix_power_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_matrix_rank_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_matrix_rank_hermitian_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_multi_dot_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_norm_subgradients_at_zero_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_pinv_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_pinv_hermitian_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_pinv_singular_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_qr_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_slogdet_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_solve_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_solve_ex_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_solve_triangular_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_svd_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_svdvals_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_tensorinv_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_tensorsolve_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_vander_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_vecdot_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linalg_vector_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linspace_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_linspace_tensor_overload_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_log10_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_log1p_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_log2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_log_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_log_normal_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_log_softmax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_log_softmax_with_dtype_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logaddexp2_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logaddexp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logcumsumexp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logdet_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logical_and_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logical_not_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logical_or_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logical_xor_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logit_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logspace_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logspace_tensor_overload_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_logsumexp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_long_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_lt_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_lu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_lu_solve_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_lu_unpack_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mH_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mT_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_amax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_amin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_argmax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_argmin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_cumprod_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_cumsum_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_fill_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_log_softmax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_logaddexp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_logsumexp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_mean_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_median_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_normalize_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_prod_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_scatter_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_select_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_softmax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_softmin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_std_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_sum_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_masked_var_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_matmul_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_matrix_exp_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_max_binary_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_max_pool2d_with_indices_backward_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_max_reduction_no_dim_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_max_reduction_with_dim_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_maximum_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mean_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_median_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_meshgrid_list_of_tensors_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_meshgrid_variadic_tensors_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_min_binary_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_min_reduction_no_dim_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_min_reduction_with_dim_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_minimum_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mode_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_movedim_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_msort_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mul_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_multinomial_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mv_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nan_to_num_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nanmean_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nanmedian_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nanquantile_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nansum_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_narrow_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_narrow_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_native_batch_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_native_dropout_backward_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_native_layer_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_ne_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_neg_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_new_empty_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_new_empty_strided_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_new_full_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_new_ones_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_new_zeros_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nextafter_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_adaptive_avg_pool3d_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_alpha_dropout_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_avg_pool1d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_avg_pool2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_avg_pool3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_batch_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_bilinear_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_binary_cross_entropy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_celu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_channel_shuffle_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_conv1d_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_conv2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_conv3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_conv_transpose1d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_conv_transpose2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_conv_transpose3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_cosine_embedding_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_cosine_similarity_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_cross_entropy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_ctc_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_dropout2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_dropout3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_dropout_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_elu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_embedding_bag_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_embedding_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_fractional_max_pool2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_fractional_max_pool3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_gaussian_nll_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_gelu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_glu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_grid_sample_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_group_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_hardshrink_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_hardsigmoid_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_hardswish_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_hardtanh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_hinge_embedding_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_huber_loss_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_instance_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_interpolate_area_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_interpolate_bicubic_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_interpolate_bilinear_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_interpolate_linear_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_interpolate_nearest_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_interpolate_trilinear_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_kl_div_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_l1_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_layer_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_leaky_relu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_linear_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_local_response_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_logsigmoid_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_margin_ranking_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_pool1d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_pool2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_pool3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_unpool1d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_unpool1d_grad_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_unpool2d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_unpool2d_grad_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_unpool3d_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_max_unpool3d_grad_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_mish_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_mse_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_multi_head_attention_forward_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_multi_margin_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_multilabel_margin_loss_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_nll_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_normalize_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pad_circular_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pad_constant_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pad_reflect_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pad_replicate_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pad_replicate_negative_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pairwise_distance_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pdist_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pixel_shuffle_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_pixel_unshuffle_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_poisson_nll_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_prelu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_relu6_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_relu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_rms_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_rrelu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_selu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_silu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_smooth_l1_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_soft_margin_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_softmin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_softmin_with_dtype_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_softplus_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_softshrink_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_softsign_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_tanhshrink_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_threshold_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_triplet_margin_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_unfold_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_upsample_bilinear_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nn_functional_upsample_nearest_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nonzero_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_nonzero_static_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_norm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_norm_fro_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_norm_inf_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_norm_nuc_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_normal_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_normal_in_place_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_normal_number_mean_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_ones_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_ones_like_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_ormqr_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_outer_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_pca_lowrank_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_permute_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_permute_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_pinverse_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_polar_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_polygamma_polygamma_n_0_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_polygamma_polygamma_n_1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_polygamma_polygamma_n_2_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_polygamma_polygamma_n_3_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_polygamma_polygamma_n_4_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_positive_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_pow_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_prod_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_put_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_qr_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_quantile_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_rad2deg_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_rand_like_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_randint_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_randint_like_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_randn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_randn_like_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_ravel_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_real_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_reciprocal_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_remainder_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_renorm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_repeat_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_repeat_interleave_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_reshape_as_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_reshape_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_resize__cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_resize_as__cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_resolve_conj_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_resolve_neg_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_roll_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_rot90_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_round_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_round_decimals_0_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_round_decimals_3_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_round_decimals_neg_3_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_rsqrt_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_rsub_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_scalar_tensor_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_scatter_add_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_scatter_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_scatter_reduce_amax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_scatter_reduce_amin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_scatter_reduce_mean_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_scatter_reduce_prod_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_scatter_reduce_sum_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_searchsorted_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_select_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_select_scatter_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sgn_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_short_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sigmoid_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sign_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_bartlett_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_blackman_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_cosine_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_exponential_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_gaussian_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_general_cosine_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_general_hamming_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_hamming_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_hann_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_kaiser_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signal_windows_nuttall_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_signbit_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sin_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sinc_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sinh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_slice_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_slice_scatter_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_softmax_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_softmax_with_dtype_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sort_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sparse_mm_reduce_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sparse_sampled_addmm_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_airy_ai_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_bessel_j0_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_bessel_j1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_bessel_y0_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_bessel_y1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_chebyshev_polynomial_t_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_chebyshev_polynomial_u_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_chebyshev_polynomial_v_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_chebyshev_polynomial_w_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_entr_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_erfcx_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_hermite_polynomial_h_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_hermite_polynomial_he_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_i0e_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_i1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_i1e_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_laguerre_polynomial_l_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_legendre_polynomial_p_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_log_ndtr_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_modified_bessel_i0_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_modified_bessel_i1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_modified_bessel_k0_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_modified_bessel_k1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_ndtr_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_ndtri_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_scaled_modified_bessel_k0_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_scaled_modified_bessel_k1_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_spherical_bessel_j0_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_xlog1py_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_special_zeta_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_split_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_split_list_args_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_split_with_sizes_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_split_with_sizes_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sqrt_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_square_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_squeeze_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_squeeze_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_squeeze_multiple_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_stack_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_std_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_std_mean_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_std_mean_unbiased_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_std_unbiased_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_stft_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sub_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sum_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_sum_to_size_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_svd_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_svd_lowrank_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_t_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_t_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_take_along_dim_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_take_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_tan_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_tanh_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_tensor_split_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_tensordot_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_tile_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_to_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_to_sparse_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_topk_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_trace_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_transpose_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_transpose_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_trapezoid_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_trapz_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_triangular_solve_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_tril_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_triu_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_true_divide_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_trunc_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unbind_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unbind_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unflatten_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unfold_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unfold_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_uniform_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unique_consecutive_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unique_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unsafe_chunk_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unsafe_split_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unsqueeze_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_unsqueeze_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_var_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_var_mean_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_var_mean_unbiased_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_var_unbiased_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_vdot_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_view_as_complex_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_view_as_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_view_copy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_view_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_vsplit_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_vstack_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_where_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_xlogy_cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_zero__cuda_float32, test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_zeros_cuda_float32, 
test/test_fx.py::TestOperatorSignaturesCUDA::test_get_torch_func_signature_exhaustive_zeros_like_cuda_float32, test/test_fx.py::TestVisionTracing::test_torchvision_models_alexnet, test/test_fx.py::TestVisionTracing::test_torchvision_models_convnext_base, test/test_fx.py::TestVisionTracing::test_torchvision_models_convnext_large, test/test_fx.py::TestVisionTracing::test_torchvision_models_convnext_small, test/test_fx.py::TestVisionTracing::test_torchvision_models_convnext_tiny, test/test_fx.py::TestVisionTracing::test_torchvision_models_densenet121, test/test_fx.py::TestVisionTracing::test_torchvision_models_densenet161, test/test_fx.py::TestVisionTracing::test_torchvision_models_densenet169, test/test_fx.py::TestVisionTracing::test_torchvision_models_densenet201, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_fasterrcnn_mobilenet_v3_large_320_fpn, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_fasterrcnn_mobilenet_v3_large_fpn, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_fasterrcnn_resnet50_fpn, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_fasterrcnn_resnet50_fpn_v2, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_fcos_resnet50_fpn, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_keypointrcnn_resnet50_fpn, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_maskrcnn_resnet50_fpn, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_maskrcnn_resnet50_fpn_v2, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_retinanet_resnet50_fpn, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_retinanet_resnet50_fpn_v2, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_ssd300_vgg16, test/test_fx.py::TestVisionTracing::test_torchvision_models_detection_ssdlite320_mobilenet_v3_large, 
test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_b0, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_b1, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_b2, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_b3, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_b4, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_b5, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_b6, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_b7, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_v2_l, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_v2_m, test/test_fx.py::TestVisionTracing::test_torchvision_models_efficientnet_v2_s, test/test_fx.py::TestVisionTracing::test_torchvision_models_googlenet, test/test_fx.py::TestVisionTracing::test_torchvision_models_inception_v3, test/test_fx.py::TestVisionTracing::test_torchvision_models_maxvit_t, test/test_fx.py::TestVisionTracing::test_torchvision_models_mnasnet0_5, test/test_fx.py::TestVisionTracing::test_torchvision_models_mnasnet0_75, test/test_fx.py::TestVisionTracing::test_torchvision_models_mnasnet1_0, test/test_fx.py::TestVisionTracing::test_torchvision_models_mnasnet1_3, test/test_fx.py::TestVisionTracing::test_torchvision_models_mobilenet_v2, test/test_fx.py::TestVisionTracing::test_torchvision_models_mobilenet_v3_large, test/test_fx.py::TestVisionTracing::test_torchvision_models_mobilenet_v3_small, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_x_16gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_x_1_6gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_x_32gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_x_3_2gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_x_400mf, 
test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_x_800mf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_x_8gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_y_128gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_y_16gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_y_1_6gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_y_32gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_y_3_2gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_y_400mf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_y_800mf, test/test_fx.py::TestVisionTracing::test_torchvision_models_regnet_y_8gf, test/test_fx.py::TestVisionTracing::test_torchvision_models_resnet101, test/test_fx.py::TestVisionTracing::test_torchvision_models_resnet152, test/test_fx.py::TestVisionTracing::test_torchvision_models_resnet18, test/test_fx.py::TestVisionTracing::test_torchvision_models_resnet34, test/test_fx.py::TestVisionTracing::test_torchvision_models_resnet50, test/test_fx.py::TestVisionTracing::test_torchvision_models_resnext101_32x8d, test/test_fx.py::TestVisionTracing::test_torchvision_models_resnext101_64x4d, test/test_fx.py::TestVisionTracing::test_torchvision_models_resnext50_32x4d, test/test_fx.py::TestVisionTracing::test_torchvision_models_segmentation_deeplabv3_mobilenet_v3_large, test/test_fx.py::TestVisionTracing::test_torchvision_models_segmentation_deeplabv3_resnet101, test/test_fx.py::TestVisionTracing::test_torchvision_models_segmentation_deeplabv3_resnet50, test/test_fx.py::TestVisionTracing::test_torchvision_models_segmentation_fcn_resnet101, test/test_fx.py::TestVisionTracing::test_torchvision_models_segmentation_fcn_resnet50, test/test_fx.py::TestVisionTracing::test_torchvision_models_segmentation_lraspp_mobilenet_v3_large, test/test_fx.py::TestVisionTracing::test_torchvision_models_shufflenet_v2_x0_5, 
test/test_fx.py::TestVisionTracing::test_torchvision_models_shufflenet_v2_x1_0, test/test_fx.py::TestVisionTracing::test_torchvision_models_shufflenet_v2_x1_5, test/test_fx.py::TestVisionTracing::test_torchvision_models_shufflenet_v2_x2_0, test/test_fx.py::TestVisionTracing::test_torchvision_models_squeezenet1_0, test/test_fx.py::TestVisionTracing::test_torchvision_models_squeezenet1_1, test/test_fx.py::TestVisionTracing::test_torchvision_models_swin_b, test/test_fx.py::TestVisionTracing::test_torchvision_models_swin_s, test/test_fx.py::TestVisionTracing::test_torchvision_models_swin_t, test/test_fx.py::TestVisionTracing::test_torchvision_models_swin_v2_b, test/test_fx.py::TestVisionTracing::test_torchvision_models_swin_v2_s, test/test_fx.py::TestVisionTracing::test_torchvision_models_swin_v2_t, test/test_fx.py::TestVisionTracing::test_torchvision_models_vgg11, test/test_fx.py::TestVisionTracing::test_torchvision_models_vgg11_bn, test/test_fx.py::TestVisionTracing::test_torchvision_models_vgg13, test/test_fx.py::TestVisionTracing::test_torchvision_models_vgg13_bn, test/test_fx.py::TestVisionTracing::test_torchvision_models_vgg16, test/test_fx.py::TestVisionTracing::test_torchvision_models_vgg16_bn, test/test_fx.py::TestVisionTracing::test_torchvision_models_vgg19, test/test_fx.py::TestVisionTracing::test_torchvision_models_vgg19_bn, test/test_fx.py::TestVisionTracing::test_torchvision_models_video_mc3_18, test/test_fx.py::TestVisionTracing::test_torchvision_models_video_mvit_v1_b, test/test_fx.py::TestVisionTracing::test_torchvision_models_video_mvit_v2_s, test/test_fx.py::TestVisionTracing::test_torchvision_models_video_r2plus1d_18, test/test_fx.py::TestVisionTracing::test_torchvision_models_video_r3d_18, test/test_fx.py::TestVisionTracing::test_torchvision_models_video_s3d, test/test_fx.py::TestVisionTracing::test_torchvision_models_video_swin3d_b, test/test_fx.py::TestVisionTracing::test_torchvision_models_video_swin3d_s, 
test/test_fx.py::TestVisionTracing::test_torchvision_models_video_swin3d_t, test/test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16, test/test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_32, test/test_fx.py::TestVisionTracing::test_torchvision_models_vit_h_14, test/test_fx.py::TestVisionTracing::test_torchvision_models_vit_l_16, test/test_fx.py::TestVisionTracing::test_torchvision_models_vit_l_32, test/test_fx.py::TestVisionTracing::test_torchvision_models_wide_resnet101_2, test/test_fx.py::TestVisionTracing::test_torchvision_models_wide_resnet50_2
2025-10-10T01:41:05.0830086Z
2025-10-10T01:41:05.0830239Z Running test_ci_sanity_check_fail 1/1 ... [2025-10-10 01:41:04.990753]
2025-10-10T01:41:05.0830556Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T01:41:05.0831295Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ci_sanity_check_fail.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:41:04.991354]
2025-10-10T01:41:17.9543217Z Running test_mobile_optimizer 1/1 ... [2025-10-10 01:41:17.953457]
2025-10-10T01:41:17.9544082Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T01:41:17.9545947Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mobile_optimizer.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:41:17.953795]
2025-10-10T01:41:23.9324287Z
2025-10-10T01:41:23.9329965Z test_mobile_optimizer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mobile_optimizer_1.1_8618f4a61fef19b6_.log
2025-10-10T01:41:23.9335100Z Running 7 items in this shard: test/test_mobile_optimizer.py::TestOptimizer::test_clone_module_with_class, test/test_mobile_optimizer.py::TestOptimizer::test_generate_mobile_module_lints, test/test_mobile_optimizer.py::TestOptimizer::test_hoist_conv_packed_params, test/test_mobile_optimizer.py::TestOptimizer::test_mobilenet_optimize_for_mobile, test/test_mobile_optimizer.py::TestOptimizer::test_optimize_for_mobile, test/test_mobile_optimizer.py::TestOptimizer::test_preserve_bundled_inputs_methods, test/test_mobile_optimizer.py::TestOptimizer::test_quantized_conv_no_asan_failures
2025-10-10T01:41:23.9339584Z
2025-10-10T01:41:23.9339926Z Running test_overrides 1/1 ... [2025-10-10 01:41:23.932612]
2025-10-10T01:41:23.9340772Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T01:41:23.9342845Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_overrides.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:41:23.933189]
2025-10-10T01:41:30.0644990Z
2025-10-10T01:41:30.0646347Z test_overrides 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_overrides_1.1_63c263b7410605c7_.log
2025-10-10T01:41:30.1277268Z Running 1480 items in this shard: test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_H___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_T___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase__backward_hooks___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase__base___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase__cdata___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase__grad___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase__grad_fn___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase__post_accumulate_grad_hooks___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase__version___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_data___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_device___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_dtype___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_grad___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_grad_dtype___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_grad_fn___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_imag___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_cpu___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_cuda___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_ipu___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_leaf___get__,
test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_maia___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_meta___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_mkldnn___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_mps___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_mtia___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_nested___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_quantized___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_sparse___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_sparse_csr___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_vulkan___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_xla___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_is_xpu___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_itemsize___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_layout___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_mH___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_mT___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_name___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_names___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_nbytes___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_ndim___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_output_nr___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_real___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_requires_grad___get__, 
test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_retains_grad___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_shape___get__, test/test_overrides.py::TestTorchFunctionOverride::test_TensorBase_volatile___get__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___add__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___and__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___array__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___array_wrap__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___bool__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___complex__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___contains__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___cuda_array_interface_____get__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___deepcopy__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___div__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___dlpack__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___dlpack_device__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___eq__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___float__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___floordiv__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___format__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___ge__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___getitem__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___gt__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___iadd__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___iand__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___idiv__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___ifloordiv__, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___ilshift__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___imod__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___imul__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___index__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___int__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___invert__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___ior__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___irshift__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___isub__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___ixor__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___le__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___len__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___long__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___lshift__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___lt__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___matmul__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___mod__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___mul__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___ne__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___nonzero__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___or__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___radd__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rand__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rdiv__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___reduce_ex__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___repr__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___reversed__, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rfloordiv__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rlshift__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rmatmul__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rmod__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rmul__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___ror__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rpow__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rrshift__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rshift__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rsub__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___rxor__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___setitem__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___setstate__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___sub__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___truediv__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor___xor__, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__autocast_to_full_precision, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__autocast_to_reduced_precision, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__clear_non_serializable_cached_data, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__coalesced_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__dimI, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__dimV, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__indices, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__is_view, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__nested_tensor_size, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__nested_tensor_storage_offsets, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__nested_tensor_strides, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__nnz, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__sparse_mask_projection, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__to_dense, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__update_names, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor__values, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_abs, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_abs_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_absolute, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_absolute_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_acos, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_acos_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_acosh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_acosh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_add, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_add_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addbmm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addbmm_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addcdiv, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addcdiv_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addcmul, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addcmul_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addmm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addmm_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addmv, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addmv_, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addr, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_addr_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_adjoint, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_align_as, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_align_to, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_all, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_allclose, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_amax, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_amin, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_aminmax, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_angle, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_any, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_apply_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arccos, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arccos_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arccosh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arccosh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arcsin, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arcsin_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arcsinh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arcsinh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arctan, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arctan2, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arctan2_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arctan_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arctanh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_arctanh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_argmax, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_argmin, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_argsort, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_argwhere, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_as_strided, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_as_strided_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_as_strided_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_asin, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_asin_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_asinh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_asinh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_atan, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_atan2, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_atan2_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_atan_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_atanh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_atanh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_backward, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_baddbmm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_baddbmm_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bernoulli, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bernoulli_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bfloat16, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bincount, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_and, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_and_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_left_shift, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_left_shift_, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_not, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_not_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_or, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_or_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_right_shift, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_right_shift_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_xor, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bitwise_xor_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bmm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_bool, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_broadcast_to, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_byte, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cauchy_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ccol_indices, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cdouble, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ceil, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ceil_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cfloat, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_chalf, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_char, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cholesky, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cholesky_inverse, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cholesky_solve, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_chunk, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clamp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clamp_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clamp_max, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clamp_max_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clamp_min, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clamp_min_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clip, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clip_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_clone, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_coalesce, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_col_indices, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_conj, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_conj_physical, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_conj_physical_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_contiguous, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_copy_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_copysign, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_copysign_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_corrcoef, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cos, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cos_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cosh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cosh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_count_nonzero, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cov, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cpu, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cross, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_crow_indices, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cuda, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cummax, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cummin, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cumprod, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cumprod_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cumsum, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_cumsum_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_data_ptr, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_deg2rad, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_deg2rad_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_dense_dim, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_dequantize, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_det, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_detach, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_detach_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_diag, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_diag_embed, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_diagflat, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_diagonal, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_diagonal_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_diff, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_digamma, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_digamma_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_dim, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_dim_order, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_dist, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_div, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_div_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_divide, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_divide_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_dot, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_double, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_dsplit, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_element_size, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_eq, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_eq_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_equal, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_erf, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_erf_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_erfc, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_erfc_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_erfinv, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_erfinv_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_exp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_exp2, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_exp2_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_exp_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_expand, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_expand_as, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_expm1, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_expm1_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_exponential_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fill_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fill_diagonal_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fix, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fix_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_flatten, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_flip, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fliplr, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_flipud, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_float, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_float_power, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_float_power_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_floor, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_floor_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_floor_divide, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_floor_divide_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fmax, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fmin, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fmod, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_fmod_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_frac, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_frac_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_frexp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_gather, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_gcd, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_gcd_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ge, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ge_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_geometric_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_geqrf, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ger, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_get_device, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_greater, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_greater_, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_greater_equal, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_greater_equal_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_gt, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_gt_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_half, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_hardshrink, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_has_names, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_hash_tensor, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_heaviside, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_heaviside_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_histc, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_histogram, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_hsplit, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_hypot, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_hypot_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_i0, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_i0_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_igamma, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_igamma_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_igammac, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_igammac_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_add, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_add_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_copy, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_copy_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_fill, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_fill_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_put, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_put_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_reduce, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_reduce_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_index_select, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_indices, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_inner, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_int, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_int_repr, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_inverse, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ipu, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_coalesced, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_complex, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_conj, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_contiguous, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_distributed, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_floating_point, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_inference, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_neg, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_nonzero, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_pinned, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_same_size, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_set_to, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_shared, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_is_signed, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_isclose, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_isfinite, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_isinf, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_isnan, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_isneginf, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_isposinf, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_isreal, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_istft, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_item, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_kron, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_kthvalue, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lcm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lcm_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ldexp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ldexp_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_le, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_le_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lerp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lerp_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_less, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_less_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_less_equal, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_less_equal_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lgamma, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lgamma_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log10, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log10_, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log1p, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log1p_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log2, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log2_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log_normal_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_log_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logaddexp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logaddexp2, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logcumsumexp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logdet, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logical_and, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logical_and_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logical_not, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logical_not_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logical_or, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logical_or_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logical_xor, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logical_xor_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logit, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logit_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_logsumexp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_long, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lt, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lt_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lu, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_lu_solve, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_map2_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_map_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_masked_fill, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_masked_fill_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_masked_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_masked_scatter_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_masked_select, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_matmul, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_matrix_exp, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_matrix_power, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_max, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_maximum, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mean, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_median, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_min, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_minimum, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mode, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_module_load, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_moveaxis, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_movedim, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_msort, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mtia, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mul, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mul_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_multinomial, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_multiply, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_multiply_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mv, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mvlgamma, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_mvlgamma_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nan_to_num, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nan_to_num_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nanmean, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nanmedian, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nanquantile, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nansum, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_narrow, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_narrow_copy, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ndimension, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ne, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ne_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_neg, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_neg_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_negative, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_negative_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nelement, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nextafter, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nextafter_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nonzero, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_nonzero_static, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_norm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_normal_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_not_equal, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_not_equal_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_numel, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_numpy, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_orgqr, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ormqr, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_outer, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_permute, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_pin_memory, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_pinverse, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_polygamma, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_polygamma_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_positive, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_pow, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_pow_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_prelu, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_prod, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_put, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_put_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_q_per_channel_axis, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_q_per_channel_scales, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_q_per_channel_zero_points, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_q_scale, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_q_zero_point, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_qr, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_qscheme, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_quantile, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_rad2deg, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_rad2deg_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_random_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_ravel, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_reciprocal, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_reciprocal_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_record_stream, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_refine_names, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_register_hook, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_register_post_accumulate_grad_hook, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_relu, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_relu_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_remainder, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_remainder_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_rename, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_rename_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_renorm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_renorm_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_repeat, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_repeat_interleave, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_requires_grad_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_reshape, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_reshape_as, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_resize, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_resize_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_resize_as, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_resize_as_, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_resize_as_sparse_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_resolve_conj, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_resolve_neg, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_retain_grad, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_roll, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_rot90, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_round, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_round_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_row_indices, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_rsqrt, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_rsqrt_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_scatter_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_scatter_add, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_scatter_add_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_scatter_reduce, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_scatter_reduce_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_select, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_select_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_set_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sgn, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sgn_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_share_memory_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_short, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sigmoid, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sigmoid_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sign, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sign_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_signbit, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sin, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sin_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sinc, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sinc_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sinh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sinh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_size, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_slice_inverse, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_slice_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_slogdet, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_smm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sort, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sparse_dim, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sparse_mask, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sparse_resize_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sparse_resize_and_clear_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_split, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_split_with_sizes, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sqrt, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sqrt_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_square, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_square_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_squeeze, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_squeeze_, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sspaddmm, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_std, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_stft, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_storage, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_storage_offset, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_storage_type, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sub, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sub_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_subtract, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_subtract_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sum, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_sum_to_size, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_svd, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_swapaxes, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_swapaxes_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_swapdims, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_swapdims_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_t, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_t_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_take, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_take_along_dim, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tan, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tan_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tanh, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tanh_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tensor_split, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tile, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_to, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_to_dense, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_to_mkldnn, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_to_sparse, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tolist, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_topk, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_trace, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_transpose, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_transpose_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_triangular_solve, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tril, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_tril_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_triu, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_triu_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_true_divide, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_true_divide_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_trunc, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_trunc_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_type, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_type_as, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unbind, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unfold, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_uniform_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unique, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unique_consecutive, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unsafe_chunk, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unsafe_split, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unsafe_split_with_sizes, 
test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unsqueeze, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_unsqueeze_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_untyped_storage, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_values, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_var, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_vdot, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_view, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_view_as, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_vsplit, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_where, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_xlogy, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_xlogy_, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_xpu, test/test_overrides.py::TestTorchFunctionOverride::test_Tensor_zero_, test/test_overrides.py::TestTorchFunctionOverride::test_base, test/test_overrides.py::TestTorchFunctionOverride::test_dtype_override, test/test_overrides.py::TestTorchFunctionOverride::test_grad, test/test_overrides.py::TestTorchFunctionOverride::test_has_torch_function_non_sequence, test/test_overrides.py::TestTorchFunctionOverride::test_mean_semantics, test/test_overrides.py::TestTorchFunctionOverride::test_mm_semantics, test/test_overrides.py::TestTorchFunctionOverride::test_pow_rpow, test/test_overrides.py::TestTorchFunctionOverride::test_precedence_semantics, test/test_overrides.py::TestTorchFunctionOverride::test_tensor_subclass_propagation, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_fft, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_fft2, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_fftn, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_fftshift, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_hfft, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_hfft2, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_hfftn, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_ifft, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_ifft2, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_ifftn, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_ifftshift, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_ihfft, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_ihfft2, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_ihfftn, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_irfft, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_irfft2, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_irfftn, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_rfft, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_rfft2, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__fft_fft_rfftn, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_cholesky, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_cholesky_ex, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_cond, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_cross, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_det, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_diagonal, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_eig, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_eigh, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_eigvals, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_eigvalsh, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_householder_product, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_inv, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_inv_ex, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_ldl_factor, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_ldl_factor_ex, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_ldl_solve, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_lstsq, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_lu, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_lu_factor, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_lu_factor_ex, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_lu_solve, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_matmul, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_matrix_exp, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_matrix_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_matrix_power, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_matrix_rank, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_multi_dot, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_pinv, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_qr, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_slogdet, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_solve, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_solve_ex, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_solve_triangular, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_svd, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_svdvals, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_tensorinv, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_tensorsolve, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_vander, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_vecdot, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__linalg_linalg_vector_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_avg_pool2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_avg_pool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_gelu, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_linear, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_log_sigmoid, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_one_hot, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_scaled_dot_product_attention, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_softplus, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__nn_softshrink, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_airy_ai, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_bessel_j0, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_bessel_j1, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_bessel_y0, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_bessel_y1, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_chebyshev_polynomial_t, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_chebyshev_polynomial_u, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_chebyshev_polynomial_v, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_chebyshev_polynomial_w, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_digamma, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_entr, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_erf, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_erfc, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_erfcx, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_erfinv, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_exp2, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_expit, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_expm1, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_gammainc, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_gammaincc, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_gammaln, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_hermite_polynomial_h, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_hermite_polynomial_he, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_i0, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_i0e, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_i1, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_i1e, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_laguerre_polynomial_l, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_legendre_polynomial_p, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_log1p, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_log_ndtr, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_log_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_logit, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_logsumexp, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_modified_bessel_i0, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_modified_bessel_i1, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_modified_bessel_k0, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_modified_bessel_k1, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_multigammaln, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_ndtr, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_ndtri, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_polygamma, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_psi, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_round, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_scaled_modified_bessel_k0, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_scaled_modified_bessel_k1, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_shifted_chebyshev_polynomial_t, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_shifted_chebyshev_polynomial_u, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_shifted_chebyshev_polynomial_v, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_shifted_chebyshev_polynomial_w, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_sinc, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_spherical_bessel_j0, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_xlog1py, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_xlogy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__C__special_special_zeta, test/test_overrides.py::TestTorchFunctionOverride::test_torch__assert_async, test/test_overrides.py::TestTorchFunctionOverride::test_torch__conj_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__functional_assert_async, test/test_overrides.py::TestTorchFunctionOverride::test_torch__fused_rms_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch__fw_primal_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__indices_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__lobpcg_lobpcg, test/test_overrides.py::TestTorchFunctionOverride::test_torch__lowrank_pca_lowrank, test/test_overrides.py::TestTorchFunctionOverride::test_torch__lowrank_svd_lowrank, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch__make_dual_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__native_batch_norm_legit, test/test_overrides.py::TestTorchFunctionOverride::test_torch__neg_view_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__reshape_alias_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__rowwise_prune, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sparse_broadcast_to_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_acos, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_asin, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_atan, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_cos, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_cosh, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_sin, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_sinh, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_sqrt, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_tan, test/test_overrides.py::TestTorchFunctionOverride::test_torch__sym_tanh, test/test_overrides.py::TestTorchFunctionOverride::test_torch__values_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch__wrapped_linear_prepack, test/test_overrides.py::TestTorchFunctionOverride::test_torch__wrapped_quantized_linear_prepacked, test/test_overrides.py::TestTorchFunctionOverride::test_torch_abs, test/test_overrides.py::TestTorchFunctionOverride::test_torch_absolute, test/test_overrides.py::TestTorchFunctionOverride::test_torch_acos, test/test_overrides.py::TestTorchFunctionOverride::test_torch_acosh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_adaptive_avg_pool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_adaptive_max_pool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_add, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_addbmm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_addcdiv, test/test_overrides.py::TestTorchFunctionOverride::test_torch_addcmul, test/test_overrides.py::TestTorchFunctionOverride::test_torch_addmm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_addmv, test/test_overrides.py::TestTorchFunctionOverride::test_torch_addr, test/test_overrides.py::TestTorchFunctionOverride::test_torch_adjoint, test/test_overrides.py::TestTorchFunctionOverride::test_torch_affine_grid_generator, test/test_overrides.py::TestTorchFunctionOverride::test_torch_alias_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_all, test/test_overrides.py::TestTorchFunctionOverride::test_torch_allclose, test/test_overrides.py::TestTorchFunctionOverride::test_torch_alpha_dropout, test/test_overrides.py::TestTorchFunctionOverride::test_torch_amax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_amin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_aminmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_angle, test/test_overrides.py::TestTorchFunctionOverride::test_torch_any, test/test_overrides.py::TestTorchFunctionOverride::test_torch_arccos, test/test_overrides.py::TestTorchFunctionOverride::test_torch_arccosh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_arcsin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_arcsinh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_arctan, test/test_overrides.py::TestTorchFunctionOverride::test_torch_arctan2, test/test_overrides.py::TestTorchFunctionOverride::test_torch_arctanh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_argmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_argmin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_argsort, test/test_overrides.py::TestTorchFunctionOverride::test_torch_argwhere, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_as_strided_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_as_strided_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_torch_asin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_asinh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_atan, test/test_overrides.py::TestTorchFunctionOverride::test_torch_atan2, test/test_overrides.py::TestTorchFunctionOverride::test_torch_atanh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_avg_pool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_baddbmm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_batch_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_batch_norm_backward_elemt, test/test_overrides.py::TestTorchFunctionOverride::test_torch_batch_norm_backward_reduce, test/test_overrides.py::TestTorchFunctionOverride::test_torch_batch_norm_elemt, test/test_overrides.py::TestTorchFunctionOverride::test_torch_batch_norm_gather_stats, test/test_overrides.py::TestTorchFunctionOverride::test_torch_batch_norm_gather_stats_with_counts, test/test_overrides.py::TestTorchFunctionOverride::test_torch_batch_norm_stats, test/test_overrides.py::TestTorchFunctionOverride::test_torch_batch_norm_update_stats, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bernoulli, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bilinear, test/test_overrides.py::TestTorchFunctionOverride::test_torch_binary_cross_entropy_with_logits, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bincount, test/test_overrides.py::TestTorchFunctionOverride::test_torch_binomial, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bitwise_and, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bitwise_left_shift, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bitwise_not, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_bitwise_or, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bitwise_right_shift, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bitwise_xor, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bmm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_broadcast_to, test/test_overrides.py::TestTorchFunctionOverride::test_torch_bucketize, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cat, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ccol_indices_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ceil, test/test_overrides.py::TestTorchFunctionOverride::test_torch_celu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_channel_shuffle, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cholesky, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cholesky_inverse, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cholesky_solve, test/test_overrides.py::TestTorchFunctionOverride::test_torch_choose_qparams_optimized, test/test_overrides.py::TestTorchFunctionOverride::test_torch_chunk, test/test_overrides.py::TestTorchFunctionOverride::test_torch_clamp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_clamp_max, test/test_overrides.py::TestTorchFunctionOverride::test_torch_clamp_min, test/test_overrides.py::TestTorchFunctionOverride::test_torch_clip, test/test_overrides.py::TestTorchFunctionOverride::test_torch_clone, test/test_overrides.py::TestTorchFunctionOverride::test_torch_col_indices_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_column_stack, test/test_overrides.py::TestTorchFunctionOverride::test_torch_combinations, test/test_overrides.py::TestTorchFunctionOverride::test_torch_complex, test/test_overrides.py::TestTorchFunctionOverride::test_torch_concat, test/test_overrides.py::TestTorchFunctionOverride::test_torch_concatenate, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_conj, test/test_overrides.py::TestTorchFunctionOverride::test_torch_conj_physical, test/test_overrides.py::TestTorchFunctionOverride::test_torch_constant_pad_nd, test/test_overrides.py::TestTorchFunctionOverride::test_torch_conv1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_conv2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_conv3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_conv_tbc, test/test_overrides.py::TestTorchFunctionOverride::test_torch_conv_transpose1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_conv_transpose2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_conv_transpose3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_convolution, test/test_overrides.py::TestTorchFunctionOverride::test_torch_copysign, test/test_overrides.py::TestTorchFunctionOverride::test_torch_corrcoef, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cos, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cosh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cosine_embedding_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cosine_similarity, test/test_overrides.py::TestTorchFunctionOverride::test_torch_count_nonzero, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cov, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cross, test/test_overrides.py::TestTorchFunctionOverride::test_torch_crow_indices_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ctc_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cummax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cummin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cumprod, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cumsum, test/test_overrides.py::TestTorchFunctionOverride::test_torch_cumulative_trapezoid, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_deg2rad, test/test_overrides.py::TestTorchFunctionOverride::test_torch_dequantize, test/test_overrides.py::TestTorchFunctionOverride::test_torch_det, test/test_overrides.py::TestTorchFunctionOverride::test_torch_detach, test/test_overrides.py::TestTorchFunctionOverride::test_torch_detach_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_diag, test/test_overrides.py::TestTorchFunctionOverride::test_torch_diag_embed, test/test_overrides.py::TestTorchFunctionOverride::test_torch_diagflat, test/test_overrides.py::TestTorchFunctionOverride::test_torch_diagonal, test/test_overrides.py::TestTorchFunctionOverride::test_torch_diagonal_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_diagonal_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_torch_diff, test/test_overrides.py::TestTorchFunctionOverride::test_torch_digamma, test/test_overrides.py::TestTorchFunctionOverride::test_torch_dist, test/test_overrides.py::TestTorchFunctionOverride::test_torch_div, test/test_overrides.py::TestTorchFunctionOverride::test_torch_divide, test/test_overrides.py::TestTorchFunctionOverride::test_torch_dot, test/test_overrides.py::TestTorchFunctionOverride::test_torch_dropout, test/test_overrides.py::TestTorchFunctionOverride::test_torch_dsmm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_dsplit, test/test_overrides.py::TestTorchFunctionOverride::test_torch_dstack, test/test_overrides.py::TestTorchFunctionOverride::test_torch_embedding, test/test_overrides.py::TestTorchFunctionOverride::test_torch_embedding_bag, test/test_overrides.py::TestTorchFunctionOverride::test_torch_empty_like, test/test_overrides.py::TestTorchFunctionOverride::test_torch_eq, test/test_overrides.py::TestTorchFunctionOverride::test_torch_equal, test/test_overrides.py::TestTorchFunctionOverride::test_torch_erf, test/test_overrides.py::TestTorchFunctionOverride::test_torch_erfc, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_erfinv, test/test_overrides.py::TestTorchFunctionOverride::test_torch_exp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_exp2, test/test_overrides.py::TestTorchFunctionOverride::test_torch_expand_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_expm1, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fake_quantize_per_channel_affine, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fake_quantize_per_tensor_affine, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fbgemm_linear_fp16_weight, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fbgemm_linear_fp16_weight_fp32_activation, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fbgemm_linear_int8_weight, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fbgemm_linear_int8_weight_fp32_activation, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fbgemm_linear_quantize_weight, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fbgemm_pack_gemm_matrix_fp16, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fbgemm_pack_quantized_matrix, test/test_overrides.py::TestTorchFunctionOverride::test_torch_feature_alpha_dropout, test/test_overrides.py::TestTorchFunctionOverride::test_torch_feature_dropout, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fix, test/test_overrides.py::TestTorchFunctionOverride::test_torch_flatten, test/test_overrides.py::TestTorchFunctionOverride::test_torch_flip, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fliplr, test/test_overrides.py::TestTorchFunctionOverride::test_torch_flipud, test/test_overrides.py::TestTorchFunctionOverride::test_torch_float_power, test/test_overrides.py::TestTorchFunctionOverride::test_torch_floor, test/test_overrides.py::TestTorchFunctionOverride::test_torch_floor_divide, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_fmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fmin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fmod, test/test_overrides.py::TestTorchFunctionOverride::test_torch_frac, test/test_overrides.py::TestTorchFunctionOverride::test_torch_frexp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_frobenius_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_full_like, test/test_overrides.py::TestTorchFunctionOverride::test_torch_function_empty_lists, test/test_overrides.py::TestTorchFunctionOverride::test_torch_function_in_float_lists, test/test_overrides.py::TestTorchFunctionOverride::test_torch_function_in_lists, test/test_overrides.py::TestTorchFunctionOverride::test_torch_function_in_scalar_lists, test/test_overrides.py::TestTorchFunctionOverride::test_torch_function_mixed_lists, test/test_overrides.py::TestTorchFunctionOverride::test_torch_function_nested_tuple_getitem, test/test_overrides.py::TestTorchFunctionOverride::test_torch_function_not_first_in_list, test/test_overrides.py::TestTorchFunctionOverride::test_torch_function_precedence_in_lists, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_atleast_1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_atleast_2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_atleast_3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_block_diag, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_broadcast_tensors, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_cartesian_prod, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_cdist, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_chain_matmul, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_einsum, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_lu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_meshgrid, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_split, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_stft, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_tensordot, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_unique, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_unique_consecutive, test/test_overrides.py::TestTorchFunctionOverride::test_torch_functional_unravel_index, test/test_overrides.py::TestTorchFunctionOverride::test_torch_fused_moving_avg_obs_fake_quant, test/test_overrides.py::TestTorchFunctionOverride::test_torch_gather, test/test_overrides.py::TestTorchFunctionOverride::test_torch_gcd, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ge, test/test_overrides.py::TestTorchFunctionOverride::test_torch_geqrf, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ger, test/test_overrides.py::TestTorchFunctionOverride::test_torch_get_device, test/test_overrides.py::TestTorchFunctionOverride::test_torch_gradient, test/test_overrides.py::TestTorchFunctionOverride::test_torch_greater, test/test_overrides.py::TestTorchFunctionOverride::test_torch_greater_equal, test/test_overrides.py::TestTorchFunctionOverride::test_torch_grid_sampler, test/test_overrides.py::TestTorchFunctionOverride::test_torch_grid_sampler_2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_grid_sampler_3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_group_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_gru, test/test_overrides.py::TestTorchFunctionOverride::test_torch_gru_cell, test/test_overrides.py::TestTorchFunctionOverride::test_torch_gt, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_hardshrink, test/test_overrides.py::TestTorchFunctionOverride::test_torch_hash_tensor, test/test_overrides.py::TestTorchFunctionOverride::test_torch_heaviside, test/test_overrides.py::TestTorchFunctionOverride::test_torch_hinge_embedding_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_histc, test/test_overrides.py::TestTorchFunctionOverride::test_torch_histogram, test/test_overrides.py::TestTorchFunctionOverride::test_torch_histogramdd, test/test_overrides.py::TestTorchFunctionOverride::test_torch_hsmm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_hsplit, test/test_overrides.py::TestTorchFunctionOverride::test_torch_hstack, test/test_overrides.py::TestTorchFunctionOverride::test_torch_hypot, test/test_overrides.py::TestTorchFunctionOverride::test_torch_i0, test/test_overrides.py::TestTorchFunctionOverride::test_torch_igamma, test/test_overrides.py::TestTorchFunctionOverride::test_torch_igammac, test/test_overrides.py::TestTorchFunctionOverride::test_torch_imag, test/test_overrides.py::TestTorchFunctionOverride::test_torch_index_add, test/test_overrides.py::TestTorchFunctionOverride::test_torch_index_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_index_fill, test/test_overrides.py::TestTorchFunctionOverride::test_torch_index_put, test/test_overrides.py::TestTorchFunctionOverride::test_torch_index_reduce, test/test_overrides.py::TestTorchFunctionOverride::test_torch_index_select, test/test_overrides.py::TestTorchFunctionOverride::test_torch_indices_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_inner, test/test_overrides.py::TestTorchFunctionOverride::test_torch_instance_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_int_repr, test/test_overrides.py::TestTorchFunctionOverride::test_torch_inverse, test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_complex, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_conj, test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_distributed, test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_floating_point, test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_inference, test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_neg, test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_nonzero, test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_same_size, test/test_overrides.py::TestTorchFunctionOverride::test_torch_is_signed, test/test_overrides.py::TestTorchFunctionOverride::test_torch_isclose, test/test_overrides.py::TestTorchFunctionOverride::test_torch_isfinite, test/test_overrides.py::TestTorchFunctionOverride::test_torch_isin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_isinf, test/test_overrides.py::TestTorchFunctionOverride::test_torch_isnan, test/test_overrides.py::TestTorchFunctionOverride::test_torch_isneginf, test/test_overrides.py::TestTorchFunctionOverride::test_torch_isposinf, test/test_overrides.py::TestTorchFunctionOverride::test_torch_isreal, test/test_overrides.py::TestTorchFunctionOverride::test_torch_istft, test/test_overrides.py::TestTorchFunctionOverride::test_torch_kl_div, test/test_overrides.py::TestTorchFunctionOverride::test_torch_kron, test/test_overrides.py::TestTorchFunctionOverride::test_torch_kthvalue, test/test_overrides.py::TestTorchFunctionOverride::test_torch_layer_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_lcm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ldexp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_le, test/test_overrides.py::TestTorchFunctionOverride::test_torch_lerp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_less, test/test_overrides.py::TestTorchFunctionOverride::test_torch_less_equal, test/test_overrides.py::TestTorchFunctionOverride::test_torch_lgamma, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_log, test/test_overrides.py::TestTorchFunctionOverride::test_torch_log10, test/test_overrides.py::TestTorchFunctionOverride::test_torch_log1p, test/test_overrides.py::TestTorchFunctionOverride::test_torch_log2, test/test_overrides.py::TestTorchFunctionOverride::test_torch_log_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logaddexp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logaddexp2, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logcumsumexp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logdet, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logical_and, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logical_not, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logical_or, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logical_xor, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logit, test/test_overrides.py::TestTorchFunctionOverride::test_torch_logsumexp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_lstm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_lstm_cell, test/test_overrides.py::TestTorchFunctionOverride::test_torch_lt, test/test_overrides.py::TestTorchFunctionOverride::test_torch_lu_solve, test/test_overrides.py::TestTorchFunctionOverride::test_torch_lu_unpack, test/test_overrides.py::TestTorchFunctionOverride::test_torch_margin_ranking_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_masked_fill, test/test_overrides.py::TestTorchFunctionOverride::test_torch_masked_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_torch_masked_select, test/test_overrides.py::TestTorchFunctionOverride::test_torch_matmul, test/test_overrides.py::TestTorchFunctionOverride::test_torch_matrix_exp, test/test_overrides.py::TestTorchFunctionOverride::test_torch_matrix_power, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_max, test/test_overrides.py::TestTorchFunctionOverride::test_torch_max_pool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_max_pool1d_with_indices, test/test_overrides.py::TestTorchFunctionOverride::test_torch_max_pool2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_max_pool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_maximum, test/test_overrides.py::TestTorchFunctionOverride::test_torch_mean, test/test_overrides.py::TestTorchFunctionOverride::test_torch_median, test/test_overrides.py::TestTorchFunctionOverride::test_torch_min, test/test_overrides.py::TestTorchFunctionOverride::test_torch_minimum, test/test_overrides.py::TestTorchFunctionOverride::test_torch_miopen_batch_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_miopen_convolution, test/test_overrides.py::TestTorchFunctionOverride::test_torch_miopen_convolution_add_relu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_miopen_convolution_relu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_miopen_convolution_transpose, test/test_overrides.py::TestTorchFunctionOverride::test_torch_miopen_depthwise_convolution, test/test_overrides.py::TestTorchFunctionOverride::test_torch_miopen_rnn, test/test_overrides.py::TestTorchFunctionOverride::test_torch_mode, test/test_overrides.py::TestTorchFunctionOverride::test_torch_moveaxis, test/test_overrides.py::TestTorchFunctionOverride::test_torch_movedim, test/test_overrides.py::TestTorchFunctionOverride::test_torch_msort, test/test_overrides.py::TestTorchFunctionOverride::test_torch_mul, test/test_overrides.py::TestTorchFunctionOverride::test_torch_multinomial, test/test_overrides.py::TestTorchFunctionOverride::test_torch_multiply, test/test_overrides.py::TestTorchFunctionOverride::test_torch_mv, test/test_overrides.py::TestTorchFunctionOverride::test_torch_mvlgamma, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_nan_to_num, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nanmean, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nanmedian, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nanquantile, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nansum, test/test_overrides.py::TestTorchFunctionOverride::test_torch_narrow, test/test_overrides.py::TestTorchFunctionOverride::test_torch_narrow_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_native_batch_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_native_channel_shuffle, test/test_overrides.py::TestTorchFunctionOverride::test_torch_native_dropout, test/test_overrides.py::TestTorchFunctionOverride::test_torch_native_group_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_native_layer_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_native_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ne, test/test_overrides.py::TestTorchFunctionOverride::test_torch_neg, test/test_overrides.py::TestTorchFunctionOverride::test_torch_negative, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nextafter, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional__threshold, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_adaptive_avg_pool2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_adaptive_avg_pool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_adaptive_max_pool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_adaptive_max_pool1d_with_indices, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_adaptive_max_pool2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_adaptive_max_pool2d_with_indices, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_adaptive_max_pool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_adaptive_max_pool3d_with_indices, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_affine_grid, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_alpha_dropout, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_batch_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_binary_cross_entropy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_binary_cross_entropy_with_logits, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_celu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_cosine_embedding_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_cross_entropy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_ctc_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_dropout, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_dropout1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_dropout2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_dropout3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_elu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_embedding, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_embedding_bag, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_feature_alpha_dropout, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_fold, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_fractional_max_pool2d, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_fractional_max_pool2d_with_indices, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_fractional_max_pool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_fractional_max_pool3d_with_indices, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_gaussian_nll_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_glu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_grid_sample, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_group_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_gumbel_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_hardtanh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_hinge_embedding_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_huber_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_instance_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_interpolate, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_kl_div, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_l1_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_layer_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_leaky_relu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_local_response_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_log_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_lp_pool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_lp_pool2d, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_lp_pool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_margin_ranking_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_pool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_pool1d_with_indices, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_pool2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_pool2d_with_indices, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_pool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_pool3d_with_indices, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_unpool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_unpool2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_max_unpool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_mish, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_mse_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_multi_head_attention_forward, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_multi_margin_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_multilabel_margin_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_multilabel_soft_margin_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_nll_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_normalize, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_pad, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_poisson_nll_loss, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_relu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_relu6, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_rms_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_rrelu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_selu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_silu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_smooth_l1_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_soft_margin_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_softmin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_softsign, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_tanhshrink, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_triplet_margin_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_triplet_margin_with_distance_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_functional_unfold, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_init_constant_, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_init_kaiming_uniform_, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_init_normal_, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nn_init_uniform_, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nonzero, test/test_overrides.py::TestTorchFunctionOverride::test_torch_nonzero_static, test/test_overrides.py::TestTorchFunctionOverride::test_torch_norm_except_dim, test/test_overrides.py::TestTorchFunctionOverride::test_torch_not_equal, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_nuclear_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_numel, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ones_like, test/test_overrides.py::TestTorchFunctionOverride::test_torch_orgqr, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ormqr, test/test_overrides.py::TestTorchFunctionOverride::test_torch_outer, test/test_overrides.py::TestTorchFunctionOverride::test_torch_pairwise_distance, test/test_overrides.py::TestTorchFunctionOverride::test_torch_pdist, test/test_overrides.py::TestTorchFunctionOverride::test_torch_permute, test/test_overrides.py::TestTorchFunctionOverride::test_torch_permute_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_pinverse, test/test_overrides.py::TestTorchFunctionOverride::test_torch_pixel_shuffle, test/test_overrides.py::TestTorchFunctionOverride::test_torch_pixel_unshuffle, test/test_overrides.py::TestTorchFunctionOverride::test_torch_poisson, test/test_overrides.py::TestTorchFunctionOverride::test_torch_poisson_nll_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_polar, test/test_overrides.py::TestTorchFunctionOverride::test_torch_polygamma, test/test_overrides.py::TestTorchFunctionOverride::test_torch_positive, test/test_overrides.py::TestTorchFunctionOverride::test_torch_pow, test/test_overrides.py::TestTorchFunctionOverride::test_torch_prelu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_prod, test/test_overrides.py::TestTorchFunctionOverride::test_torch_put, test/test_overrides.py::TestTorchFunctionOverride::test_torch_q_per_channel_axis, test/test_overrides.py::TestTorchFunctionOverride::test_torch_q_per_channel_scales, test/test_overrides.py::TestTorchFunctionOverride::test_torch_q_per_channel_zero_points, test/test_overrides.py::TestTorchFunctionOverride::test_torch_q_scale, test/test_overrides.py::TestTorchFunctionOverride::test_torch_q_zero_point, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_qr, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantile, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantize_per_channel, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantize_per_tensor, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantize_per_tensor_dynamic, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantized_batch_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantized_gru_cell, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantized_lstm_cell, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantized_max_pool1d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantized_max_pool2d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantized_max_pool3d, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantized_rnn_relu_cell, test/test_overrides.py::TestTorchFunctionOverride::test_torch_quantized_rnn_tanh_cell, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rad2deg, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rand_like, test/test_overrides.py::TestTorchFunctionOverride::test_torch_randint_like, test/test_overrides.py::TestTorchFunctionOverride::test_torch_randn_like, test/test_overrides.py::TestTorchFunctionOverride::test_torch_ravel, test/test_overrides.py::TestTorchFunctionOverride::test_torch_real, test/test_overrides.py::TestTorchFunctionOverride::test_torch_reciprocal, test/test_overrides.py::TestTorchFunctionOverride::test_torch_relu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_remainder, test/test_overrides.py::TestTorchFunctionOverride::test_torch_renorm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_repeat_interleave, test/test_overrides.py::TestTorchFunctionOverride::test_torch_reshape, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_resolve_conj, test/test_overrides.py::TestTorchFunctionOverride::test_torch_resolve_neg, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rms_norm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rnn_relu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rnn_relu_cell, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rnn_tanh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rnn_tanh_cell, test/test_overrides.py::TestTorchFunctionOverride::test_torch_roll, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rot90, test/test_overrides.py::TestTorchFunctionOverride::test_torch_round, test/test_overrides.py::TestTorchFunctionOverride::test_torch_row_indices_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_row_stack, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rrelu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rsqrt, test/test_overrides.py::TestTorchFunctionOverride::test_torch_rsub, test/test_overrides.py::TestTorchFunctionOverride::test_torch_saddmm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_torch_scatter_add, test/test_overrides.py::TestTorchFunctionOverride::test_torch_scatter_reduce, test/test_overrides.py::TestTorchFunctionOverride::test_torch_searchsorted, test/test_overrides.py::TestTorchFunctionOverride::test_torch_segment_reduce, test/test_overrides.py::TestTorchFunctionOverride::test_torch_select, test/test_overrides.py::TestTorchFunctionOverride::test_torch_select_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_select_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_torch_selu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sgn, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sigmoid, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_sign, test/test_overrides.py::TestTorchFunctionOverride::test_torch_signbit, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sin, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sinc, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sinh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_slice_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_slice_inverse, test/test_overrides.py::TestTorchFunctionOverride::test_torch_slice_scatter, test/test_overrides.py::TestTorchFunctionOverride::test_torch_slogdet, test/test_overrides.py::TestTorchFunctionOverride::test_torch_smm, test/test_overrides.py::TestTorchFunctionOverride::test_torch_softmax, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sort, test/test_overrides.py::TestTorchFunctionOverride::test_torch_split_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_split_with_sizes, test/test_overrides.py::TestTorchFunctionOverride::test_torch_split_with_sizes_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sqrt, test/test_overrides.py::TestTorchFunctionOverride::test_torch_square, test/test_overrides.py::TestTorchFunctionOverride::test_torch_squeeze, test/test_overrides.py::TestTorchFunctionOverride::test_torch_squeeze_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_stack, test/test_overrides.py::TestTorchFunctionOverride::test_torch_std, test/test_overrides.py::TestTorchFunctionOverride::test_torch_std_mean, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sub, test/test_overrides.py::TestTorchFunctionOverride::test_torch_subtract, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sum, test/test_overrides.py::TestTorchFunctionOverride::test_torch_svd, test/test_overrides.py::TestTorchFunctionOverride::test_torch_swapaxes, test/test_overrides.py::TestTorchFunctionOverride::test_torch_swapdims, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_sym_float, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sym_int, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sym_ite, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sym_max, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sym_min, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sym_not, test/test_overrides.py::TestTorchFunctionOverride::test_torch_sym_sum, test/test_overrides.py::TestTorchFunctionOverride::test_torch_t, test/test_overrides.py::TestTorchFunctionOverride::test_torch_t_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_take, test/test_overrides.py::TestTorchFunctionOverride::test_torch_take_along_dim, test/test_overrides.py::TestTorchFunctionOverride::test_torch_tan, test/test_overrides.py::TestTorchFunctionOverride::test_torch_tanh, test/test_overrides.py::TestTorchFunctionOverride::test_torch_tensor_split, test/test_overrides.py::TestTorchFunctionOverride::test_torch_threshold, test/test_overrides.py::TestTorchFunctionOverride::test_torch_tile, test/test_overrides.py::TestTorchFunctionOverride::test_torch_topk, test/test_overrides.py::TestTorchFunctionOverride::test_torch_trace, test/test_overrides.py::TestTorchFunctionOverride::test_torch_transpose, test/test_overrides.py::TestTorchFunctionOverride::test_torch_transpose_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_trapezoid, test/test_overrides.py::TestTorchFunctionOverride::test_torch_trapz, test/test_overrides.py::TestTorchFunctionOverride::test_torch_triangular_solve, test/test_overrides.py::TestTorchFunctionOverride::test_torch_tril, test/test_overrides.py::TestTorchFunctionOverride::test_torch_triplet_margin_loss, test/test_overrides.py::TestTorchFunctionOverride::test_torch_triu, test/test_overrides.py::TestTorchFunctionOverride::test_torch_true_divide, 
test/test_overrides.py::TestTorchFunctionOverride::test_torch_trunc, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unbind, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unbind_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unflatten, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unfold_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unsafe_chunk, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unsafe_split, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unsafe_split_with_sizes, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unsqueeze, test/test_overrides.py::TestTorchFunctionOverride::test_torch_unsqueeze_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_values_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_var, test/test_overrides.py::TestTorchFunctionOverride::test_torch_var_mean, test/test_overrides.py::TestTorchFunctionOverride::test_torch_vdot, test/test_overrides.py::TestTorchFunctionOverride::test_torch_view_as_complex, test/test_overrides.py::TestTorchFunctionOverride::test_torch_view_as_complex_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_view_as_real, test/test_overrides.py::TestTorchFunctionOverride::test_torch_view_as_real_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_view_copy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_vsplit, test/test_overrides.py::TestTorchFunctionOverride::test_torch_vstack, test/test_overrides.py::TestTorchFunctionOverride::test_torch_where, test/test_overrides.py::TestTorchFunctionOverride::test_torch_xlogy, test/test_overrides.py::TestTorchFunctionOverride::test_torch_zeros_like, test/test_overrides.py::TestTorchFunctionOverride::test_user_implementation_raises, test/test_overrides.py::TestEinsumOverride::test_wrapper, test/test_overrides.py::TestGradCheckOverride::test_gradcheck, 
test/test_overrides.py::TestNamedTuple::test_max, test/test_overrides.py::TestGradNewOnesOverride::test_newones, test/test_overrides.py::TestPickle::test_pickle, test/test_overrides.py::TestBroadcastAllOverride::test_broadcast_all, test/test_overrides.py::TestWrapTorchFunction::test_wrap_torch_function, test/test_overrides.py::TestIndexing::test_getitem, test/test_overrides.py::TestIndexing::test_getitem_subclass, test/test_overrides.py::TestIndexing::test_setitem, test/test_overrides.py::TestIndexing::test_setitem_subclass, test/test_overrides.py::TestIndexing::test_setitem_val, test/test_overrides.py::TestIterator::test_iterator, test/test_overrides.py::TestRNN::test_rnn, test/test_overrides.py::TestDisabledTorchFunction::test_parameter_does_not_prevent_dispatch, test/test_overrides.py::TestResolveName::test_resolve_name, test/test_overrides.py::TestTorchFunctionWarning::test_torch_function_standalone_class, test/test_overrides.py::TestTorchFunctionWarning::test_torch_function_tensor_subclass, test/test_overrides.py::TestDisabledUserWarnings::test_no_implicit_user_warning_for_deprecated_functions, test/test_overrides.py::TestTorchFunctionMode::test_all_same_mode, test/test_overrides.py::TestTorchFunctionMode::test_basic, test/test_overrides.py::TestTorchFunctionMode::test_custom_device_type, test/test_overrides.py::TestTorchFunctionMode::test_device_context_semantics, test/test_overrides.py::TestTorchFunctionMode::test_disable_enable_subclass, test/test_overrides.py::TestTorchFunctionMode::test_disable_enable_torch_function_ctx, test/test_overrides.py::TestTorchFunctionMode::test_disable_subclass_mode, test/test_overrides.py::TestTorchFunctionMode::test_disable_subclass_not_mode, test/test_overrides.py::TestTorchFunctionMode::test_distributions_bernoulli, test/test_overrides.py::TestTorchFunctionMode::test_error_using_class_method_on_mode, test/test_overrides.py::TestTorchFunctionMode::test_factory_override, 
test/test_overrides.py::TestTorchFunctionMode::test_get_cur_mode, test/test_overrides.py::TestTorchFunctionMode::test_get_mode_stack, test/test_overrides.py::TestTorchFunctionMode::test_getitem_call, test/test_overrides.py::TestTorchFunctionMode::test_mode_notimplemented_loop, test/test_overrides.py::TestTorchFunctionMode::test_modes_handle_first, test/test_overrides.py::TestTorchFunctionMode::test_modes_return_notimplemented, test/test_overrides.py::TestTorchFunctionMode::test_nested_modes_with_python_has_torch_function, test/test_overrides.py::TestTorchFunctionMode::test_nested_same_mode, test/test_overrides.py::TestTorchFunctionMode::test_nn_parse_to, test/test_overrides.py::TestTorchFunctionMode::test_reentrant_mode_idiom, test/test_overrides.py::TestTorchFunctionMode::test_restacking_with_ancestor, test/test_overrides.py::TestTorchFunctionMode::test_subclass_hash, test/test_overrides.py::TestTorchFunctionMode::test_torch_function_all_disabled_api, test/test_overrides.py::TestTorchFunctionMode::test_with_mode, test/test_overrides.py::TestTorchFunctionMode::test_with_mode_created_separately, test/test_overrides.py::TestTorchFunctionMode::test_with_nested_modes 2025-10-10T01:41:30.1630613Z 2025-10-10T01:41:30.1630805Z Running distributions/test_distributions 1/1 ... [2025-10-10 01:41:30.068099] 2025-10-10T01:41:30.1631144Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:41:30.1631898Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributions/test_distributions.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:41:30.068650] 2025-10-10T01:42:30.2097183Z 2025-10-10T01:42:30.2098689Z distributions/test_distributions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributions.test_distributions_1.1_524b875d44305f49_.log 2025-10-10T01:42:30.2205868Z Running 230 items in this shard: test/distributions/test_distributions.py::TestDistributions::test_argmax_relaxed_categorical, test/distributions/test_distributions.py::TestDistributions::test_bernoulli, test/distributions/test_distributions.py::TestDistributions::test_bernoulli_3d, test/distributions/test_distributions.py::TestDistributions::test_bernoulli_enumerate_support, test/distributions/test_distributions.py::TestDistributions::test_beta_log_prob, test/distributions/test_distributions.py::TestDistributions::test_beta_sample, test/distributions/test_distributions.py::TestDistributions::test_beta_shape, test/distributions/test_distributions.py::TestDistributions::test_beta_underflow, test/distributions/test_distributions.py::TestDistributions::test_beta_underflow_gpu, test/distributions/test_distributions.py::TestDistributions::test_binomial, test/distributions/test_distributions.py::TestDistributions::test_binomial_bfloat16, test/distributions/test_distributions.py::TestDistributions::test_binomial_enumerate_support, test/distributions/test_distributions.py::TestDistributions::test_binomial_extreme_vals, test/distributions/test_distributions.py::TestDistributions::test_binomial_half, test/distributions/test_distributions.py::TestDistributions::test_binomial_log_prob_and_entropy, test/distributions/test_distributions.py::TestDistributions::test_binomial_log_prob_vectorized_count, test/distributions/test_distributions.py::TestDistributions::test_binomial_sample, test/distributions/test_distributions.py::TestDistributions::test_binomial_stable, test/distributions/test_distributions.py::TestDistributions::test_binomial_vectorized_count, 
test/distributions/test_distributions.py::TestDistributions::test_categorical_1d, test/distributions/test_distributions.py::TestDistributions::test_categorical_2d, test/distributions/test_distributions.py::TestDistributions::test_categorical_enumerate_support, test/distributions/test_distributions.py::TestDistributions::test_cauchy, test/distributions/test_distributions.py::TestDistributions::test_cdf_icdf_inverse, test/distributions/test_distributions.py::TestDistributions::test_cdf_log_prob, test/distributions/test_distributions.py::TestDistributions::test_chi2_sample, test/distributions/test_distributions.py::TestDistributions::test_chi2_shape, test/distributions/test_distributions.py::TestDistributions::test_continuous_bernoulli, test/distributions/test_distributions.py::TestDistributions::test_continuous_bernoulli_3d, test/distributions/test_distributions.py::TestDistributions::test_dirichlet_log_prob, test/distributions/test_distributions.py::TestDistributions::test_dirichlet_log_prob_zero, test/distributions/test_distributions.py::TestDistributions::test_dirichlet_mode, test/distributions/test_distributions.py::TestDistributions::test_dirichlet_sample, test/distributions/test_distributions.py::TestDistributions::test_dirichlet_shape, test/distributions/test_distributions.py::TestDistributions::test_distribution_expand, test/distributions/test_distributions.py::TestDistributions::test_distribution_subclass_expand, test/distributions/test_distributions.py::TestDistributions::test_enumerate_support_type, test/distributions/test_distributions.py::TestDistributions::test_exponential, test/distributions/test_distributions.py::TestDistributions::test_exponential_sample, test/distributions/test_distributions.py::TestDistributions::test_fishersnedecor, test/distributions/test_distributions.py::TestDistributions::test_fishersnedecor_sample, test/distributions/test_distributions.py::TestDistributions::test_gamma_gpu_sample, 
test/distributions/test_distributions.py::TestDistributions::test_gamma_gpu_shape, test/distributions/test_distributions.py::TestDistributions::test_gamma_log_prob_at_boundary, test/distributions/test_distributions.py::TestDistributions::test_gamma_sample, test/distributions/test_distributions.py::TestDistributions::test_gamma_shape, test/distributions/test_distributions.py::TestDistributions::test_generalized_pareto, test/distributions/test_distributions.py::TestDistributions::test_generalized_pareto_sample, test/distributions/test_distributions.py::TestDistributions::test_geometric, test/distributions/test_distributions.py::TestDistributions::test_geometric_log_prob_and_entropy, test/distributions/test_distributions.py::TestDistributions::test_geometric_sample, test/distributions/test_distributions.py::TestDistributions::test_gumbel, test/distributions/test_distributions.py::TestDistributions::test_gumbel_sample, test/distributions/test_distributions.py::TestDistributions::test_halfcauchy, test/distributions/test_distributions.py::TestDistributions::test_halfnormal, test/distributions/test_distributions.py::TestDistributions::test_halfnormal_logprob, test/distributions/test_distributions.py::TestDistributions::test_halfnormal_sample, test/distributions/test_distributions.py::TestDistributions::test_has_examples, test/distributions/test_distributions.py::TestDistributions::test_independent_expand, test/distributions/test_distributions.py::TestDistributions::test_independent_shape, test/distributions/test_distributions.py::TestDistributions::test_invalid_parameter_broadcasting, test/distributions/test_distributions.py::TestDistributions::test_inversegamma, test/distributions/test_distributions.py::TestDistributions::test_inversegamma_sample, test/distributions/test_distributions.py::TestDistributions::test_kumaraswamy_mean_variance, test/distributions/test_distributions.py::TestDistributions::test_kumaraswamy_shape, 
test/distributions/test_distributions.py::TestDistributions::test_laplace, test/distributions/test_distributions.py::TestDistributions::test_laplace_sample, test/distributions/test_distributions.py::TestDistributions::test_lazy_property_grad, test/distributions/test_distributions.py::TestDistributions::test_lkj_cholesky_log_prob, test/distributions/test_distributions.py::TestDistributions::test_logisticnormal, test/distributions/test_distributions.py::TestDistributions::test_logisticnormal_logprob, test/distributions/test_distributions.py::TestDistributions::test_logisticnormal_sample, test/distributions/test_distributions.py::TestDistributions::test_lognormal, test/distributions/test_distributions.py::TestDistributions::test_lognormal_logprob, test/distributions/test_distributions.py::TestDistributions::test_lognormal_sample, test/distributions/test_distributions.py::TestDistributions::test_lowrank_multivariate_normal_log_prob, test/distributions/test_distributions.py::TestDistributions::test_lowrank_multivariate_normal_moments, test/distributions/test_distributions.py::TestDistributions::test_lowrank_multivariate_normal_properties, test/distributions/test_distributions.py::TestDistributions::test_lowrank_multivariate_normal_sample, test/distributions/test_distributions.py::TestDistributions::test_lowrank_multivariate_normal_shape, test/distributions/test_distributions.py::TestDistributions::test_mixture_same_family_binomial_log_prob, test/distributions/test_distributions.py::TestDistributions::test_mixture_same_family_normal_log_prob, test/distributions/test_distributions.py::TestDistributions::test_mixture_same_family_sample, test/distributions/test_distributions.py::TestDistributions::test_mixture_same_family_shape, test/distributions/test_distributions.py::TestDistributions::test_mode, test/distributions/test_distributions.py::TestDistributions::test_multinomial_1d, 
test/distributions/test_distributions.py::TestDistributions::test_multinomial_1d_log_prob_and_entropy, test/distributions/test_distributions.py::TestDistributions::test_multinomial_2d, test/distributions/test_distributions.py::TestDistributions::test_multinomial_sequential_draw, test/distributions/test_distributions.py::TestDistributions::test_multivariate_normal_log_prob, test/distributions/test_distributions.py::TestDistributions::test_multivariate_normal_moments, test/distributions/test_distributions.py::TestDistributions::test_multivariate_normal_properties, test/distributions/test_distributions.py::TestDistributions::test_multivariate_normal_sample, test/distributions/test_distributions.py::TestDistributions::test_multivariate_normal_shape, test/distributions/test_distributions.py::TestDistributions::test_multivariate_normal_stable_with_precision_matrix, test/distributions/test_distributions.py::TestDistributions::test_negative_binomial, test/distributions/test_distributions.py::TestDistributions::test_negative_binomial_log_prob, test/distributions/test_distributions.py::TestDistributions::test_negative_binomial_log_prob_vectorized_count, test/distributions/test_distributions.py::TestDistributions::test_normal, test/distributions/test_distributions.py::TestDistributions::test_normal_sample, test/distributions/test_distributions.py::TestDistributions::test_one_hot_categorical_1d, test/distributions/test_distributions.py::TestDistributions::test_one_hot_categorical_2d, test/distributions/test_distributions.py::TestDistributions::test_one_hot_categorical_enumerate_support, test/distributions/test_distributions.py::TestDistributions::test_pareto, test/distributions/test_distributions.py::TestDistributions::test_pareto_sample, test/distributions/test_distributions.py::TestDistributions::test_poisson_forward_ad, test/distributions/test_distributions.py::TestDistributions::test_poisson_gpu_sample, 
test/distributions/test_distributions.py::TestDistributions::test_poisson_log_prob, test/distributions/test_distributions.py::TestDistributions::test_poisson_sample, test/distributions/test_distributions.py::TestDistributions::test_poisson_shape, test/distributions/test_distributions.py::TestDistributions::test_relaxed_bernoulli, test/distributions/test_distributions.py::TestDistributions::test_relaxed_one_hot_categorical_1d, test/distributions/test_distributions.py::TestDistributions::test_relaxed_one_hot_categorical_2d, test/distributions/test_distributions.py::TestDistributions::test_repr, test/distributions/test_distributions.py::TestDistributions::test_rounded_relaxed_bernoulli, test/distributions/test_distributions.py::TestDistributions::test_rsample_requires_grad, test/distributions/test_distributions.py::TestDistributions::test_sample_detached, test/distributions/test_distributions.py::TestDistributions::test_studentT, test/distributions/test_distributions.py::TestDistributions::test_studentT_log_prob, test/distributions/test_distributions.py::TestDistributions::test_studentT_sample, test/distributions/test_distributions.py::TestDistributions::test_support_attributes, test/distributions/test_distributions.py::TestDistributions::test_torch_binomial_dtype_errors, test/distributions/test_distributions.py::TestDistributions::test_uniform, test/distributions/test_distributions.py::TestDistributions::test_valid_parameter_broadcasting, test/distributions/test_distributions.py::TestDistributions::test_vonmises_logprob, test/distributions/test_distributions.py::TestDistributions::test_vonmises_sample, test/distributions/test_distributions.py::TestDistributions::test_wishart_log_prob, test/distributions/test_distributions.py::TestDistributions::test_wishart_moments, test/distributions/test_distributions.py::TestDistributions::test_wishart_properties, test/distributions/test_distributions.py::TestDistributions::test_wishart_sample, 
test/distributions/test_distributions.py::TestDistributions::test_wishart_shape, test/distributions/test_distributions.py::TestDistributions::test_wishart_stable_with_precision_matrix, test/distributions/test_distributions.py::TestDistributions::test_zero_excluded_binomial, test/distributions/test_distributions.py::TestRsample::test_beta_wrt_alpha, test/distributions/test_distributions.py::TestRsample::test_beta_wrt_beta, test/distributions/test_distributions.py::TestRsample::test_chi2, test/distributions/test_distributions.py::TestRsample::test_dirichlet_multivariate, test/distributions/test_distributions.py::TestRsample::test_dirichlet_on_diagonal, test/distributions/test_distributions.py::TestRsample::test_dirichlet_tangent_field, test/distributions/test_distributions.py::TestRsample::test_gamma, test/distributions/test_distributions.py::TestDistributionShapes::test_bernoulli_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_bernoulli_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_beta_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_beta_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_binomial_shape, test/distributions/test_distributions.py::TestDistributionShapes::test_binomial_shape_vectorized_n, test/distributions/test_distributions.py::TestDistributionShapes::test_categorical_shape, test/distributions/test_distributions.py::TestDistributionShapes::test_cauchy_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_cauchy_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_chi2_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_chi2_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_continuous_bernoulli_shape_scalar_params, 
test/distributions/test_distributions.py::TestDistributionShapes::test_continuous_bernoulli_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_dirichlet_shape, test/distributions/test_distributions.py::TestDistributionShapes::test_entropy_shape, test/distributions/test_distributions.py::TestDistributionShapes::test_exponential_shape_scalar_param, test/distributions/test_distributions.py::TestDistributionShapes::test_exponential_shape_tensor_param, test/distributions/test_distributions.py::TestDistributionShapes::test_gamma_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_gamma_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_geometric_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_geometric_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_gumbel_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_halfcauchy_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_halfcauchy_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_kumaraswamy_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_laplace_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_laplace_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_mixture_same_family_mean_shape, test/distributions/test_distributions.py::TestDistributionShapes::test_mixture_same_family_shape, test/distributions/test_distributions.py::TestDistributionShapes::test_multinomial_shape, test/distributions/test_distributions.py::TestDistributionShapes::test_normal_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_normal_shape_tensor_params, 
test/distributions/test_distributions.py::TestDistributionShapes::test_one_hot_categorical_shape, test/distributions/test_distributions.py::TestDistributionShapes::test_pareto_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_studentT_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_studentT_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_uniform_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_uniform_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_vonmises_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_vonmises_shape_tensor_params, test/distributions/test_distributions.py::TestDistributionShapes::test_weibull_scale_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_wishart_shape_scalar_params, test/distributions/test_distributions.py::TestDistributionShapes::test_wishart_shape_tensor_params, test/distributions/test_distributions.py::TestKL::test_entropy_exponential_family, test/distributions/test_distributions.py::TestKL::test_entropy_monte_carlo, test/distributions/test_distributions.py::TestKL::test_kl_edgecases, test/distributions/test_distributions.py::TestKL::test_kl_exponential_family, test/distributions/test_distributions.py::TestKL::test_kl_infinite, test/distributions/test_distributions.py::TestKL::test_kl_lowrank_multivariate_normal, test/distributions/test_distributions.py::TestKL::test_kl_lowrank_multivariate_normal_batched, test/distributions/test_distributions.py::TestKL::test_kl_monte_carlo, test/distributions/test_distributions.py::TestKL::test_kl_multivariate_normal, test/distributions/test_distributions.py::TestKL::test_kl_multivariate_normal_batched, test/distributions/test_distributions.py::TestKL::test_kl_multivariate_normal_batched_broadcasted, 
test/distributions/test_distributions.py::TestKL::test_kl_shape, test/distributions/test_distributions.py::TestKL::test_kl_transformed, test/distributions/test_distributions.py::TestConstraints::test_params_constraints, test/distributions/test_distributions.py::TestConstraints::test_support_constraints, test/distributions/test_distributions.py::TestNumericalStability::test_bernoulli_gradient, test/distributions/test_distributions.py::TestNumericalStability::test_bernoulli_with_logits_overflow, test/distributions/test_distributions.py::TestNumericalStability::test_bernoulli_with_logits_underflow, test/distributions/test_distributions.py::TestNumericalStability::test_categorical_log_prob, test/distributions/test_distributions.py::TestNumericalStability::test_categorical_log_prob_with_logits, test/distributions/test_distributions.py::TestNumericalStability::test_continuous_bernoulli_gradient, test/distributions/test_distributions.py::TestNumericalStability::test_continuous_bernoulli_with_logits_overflow, test/distributions/test_distributions.py::TestNumericalStability::test_continuous_bernoulli_with_logits_underflow, test/distributions/test_distributions.py::TestNumericalStability::test_multinomial_log_prob, test/distributions/test_distributions.py::TestNumericalStability::test_multinomial_log_prob_with_logits, test/distributions/test_distributions.py::TestLazyLogitsInitialization::test_lazy_logits_initialization, test/distributions/test_distributions.py::TestLazyLogitsInitialization::test_lazy_probs_initialization, test/distributions/test_distributions.py::TestAgainstScipy::test_cdf, test/distributions/test_distributions.py::TestAgainstScipy::test_icdf, test/distributions/test_distributions.py::TestAgainstScipy::test_mean, test/distributions/test_distributions.py::TestAgainstScipy::test_variance_stddev, test/distributions/test_distributions.py::TestFunctors::test_cat_event_dim, test/distributions/test_distributions.py::TestFunctors::test_cat_transform, 
test/distributions/test_distributions.py::TestFunctors::test_cat_transform_non_uniform, test/distributions/test_distributions.py::TestFunctors::test_stack_transform, test/distributions/test_distributions.py::TestValidation::test_invalid, test/distributions/test_distributions.py::TestValidation::test_invalid_log_probs_arg, test/distributions/test_distributions.py::TestValidation::test_valid, test/distributions/test_distributions.py::TestValidation::test_warning_unimplemented_constraints, test/distributions/test_distributions.py::TestJit::test_cdf, test/distributions/test_distributions.py::TestJit::test_entropy, test/distributions/test_distributions.py::TestJit::test_enumerate_support, test/distributions/test_distributions.py::TestJit::test_log_prob, test/distributions/test_distributions.py::TestJit::test_mean, test/distributions/test_distributions.py::TestJit::test_rsample, test/distributions/test_distributions.py::TestJit::test_sample, test/distributions/test_distributions.py::TestJit::test_variance 2025-10-10T01:42:30.2332713Z 2025-10-10T01:42:30.2332916Z Running test_multiprocessing_spawn 1/1 ... [2025-10-10 01:42:30.210565] 2025-10-10T01:42:30.2333311Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:42:30.2334228Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_multiprocessing_spawn.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:42:30.211169] 2025-10-10T01:44:49.7606374Z 2025-10-10T01:44:49.7612638Z test_multiprocessing_spawn 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_multiprocessing_spawn_1.1_fc4c8dec5e5094ae_.log 2025-10-10T01:44:49.7631646Z Running 31 items in this shard: test/test_multiprocessing_spawn.py::SpawnTest::test_exception_all, test/test_multiprocessing_spawn.py::SpawnTest::test_exception_raises, test/test_multiprocessing_spawn.py::SpawnTest::test_exception_single, test/test_multiprocessing_spawn.py::SpawnTest::test_first_argument_index, test/test_multiprocessing_spawn.py::SpawnTest::test_signal_raises, test/test_multiprocessing_spawn.py::SpawnTest::test_success, test/test_multiprocessing_spawn.py::SpawnTest::test_success_first_then_exception, test/test_multiprocessing_spawn.py::SpawnTest::test_success_non_blocking, test/test_multiprocessing_spawn.py::SpawnTest::test_terminate_exit_grace_period0, test/test_multiprocessing_spawn.py::SpawnTest::test_terminate_exit_grace_period_20, test/test_multiprocessing_spawn.py::SpawnTest::test_terminate_signal, test/test_multiprocessing_spawn.py::ForkTest::test_exception_all, test/test_multiprocessing_spawn.py::ForkTest::test_exception_single, test/test_multiprocessing_spawn.py::ForkTest::test_first_argument_index, test/test_multiprocessing_spawn.py::ForkTest::test_success, test/test_multiprocessing_spawn.py::ForkTest::test_success_first_then_exception, test/test_multiprocessing_spawn.py::ForkTest::test_success_non_blocking, test/test_multiprocessing_spawn.py::ForkTest::test_terminate_exit_grace_period0, test/test_multiprocessing_spawn.py::ForkTest::test_terminate_exit_grace_period_20, test/test_multiprocessing_spawn.py::ForkTest::test_terminate_signal, test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_exception_all, test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_exception_single, 
test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_first_argument_index, test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_success, test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_success_first_then_exception, test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_success_non_blocking, test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_terminate_exit_grace_period0, test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_terminate_exit_grace_period_20, test/test_multiprocessing_spawn.py::ParallelForkServerShouldWorkTest::test_terminate_signal, test/test_multiprocessing_spawn.py::ParallelForkServerPerfTest::test_forkserver_perf, test/test_multiprocessing_spawn.py::ErrorTest::test_errors_pickleable 2025-10-10T01:44:49.7649665Z 2025-10-10T01:44:49.7649955Z Running doctests 1/1 ... [2025-10-10 01:44:49.760857] 2025-10-10T01:44:50.2160035Z msg = Cannot scrape callname=Library.fallback in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=367. 2025-10-10T01:44:50.2161585Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:50.2162619Z Registers the function implementation as the fallback for the given key. 2025-10-10T01:44:50.2163224Z 2025-10-10T01:44:50.2163604Z This function only works for a library with global namespace ("_"). 2025-10-10T01:44:50.2164168Z 2025-10-10T01:44:50.2164339Z Args: 2025-10-10T01:44:50.2165048Z fn: function used as fallback for the given dispatch key or :func:`~fallthrough_kernel` 2025-10-10T01:44:50.2165903Z to register a fallthrough. 2025-10-10T01:44:50.2166826Z dispatch_key: dispatch key that the input function should be registered for. By default, it uses 2025-10-10T01:44:50.2167835Z the dispatch key that the library was created with. 
2025-10-10T01:44:50.2168910Z with_keyset: flag controlling if the current dispatcher call keyset should be passed as the first argument 2025-10-10T01:44:50.2170232Z to :attr:`fn` when calling. This should be used to create the appropriate keyset for redispatch calls. 2025-10-10T01:44:50.2170949Z 2025-10-10T01:44:50.2171137Z Example:: 2025-10-10T01:44:50.2171391Z 2025-10-10T01:44:50.2171585Z >>> my_lib = Library("_", "IMPL") 2025-10-10T01:44:50.2172192Z >>> def fallback_kernel(op, *args, **kwargs): 2025-10-10T01:44:50.2172824Z >>> # Handle all autocast ops generically 2025-10-10T01:44:50.2173377Z >>> # ... 2025-10-10T01:44:50.2173895Z >>> my_lib.fallback(fallback_kernel, "Autocast") 2025-10-10T01:44:50.2174476Z 2025-10-10T01:44:50.2175695Z Original Error: IndentationError('expected an indented block after function definition on line 2', ('', 5, 1, 'my_lib.fallback(fallback_kernel, "Autocast")\n', 5, 7)) 2025-10-10T01:44:50.2176896Z 2025-10-10T01:44:50.2177135Z my_lib.fallback(fallback_kernel, "Autocast") 2025-10-10T01:44:50.2177685Z ^ 2025-10-10T01:44:50.2274467Z msg = Cannot scrape callname=register_fake in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=942. 2025-10-10T01:44:50.2276043Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:50.2277077Z Register a FakeTensor implementation ("fake impl") for this operator. 2025-10-10T01:44:50.2277644Z 2025-10-10T01:44:50.2278003Z Also sometimes known as a "meta kernel", "abstract impl". 2025-10-10T01:44:50.2278488Z 2025-10-10T01:44:50.2278906Z An "FakeTensor implementation" specifies the behavior of this operator on 2025-10-10T01:44:50.2279888Z Tensors that carry no data ("FakeTensor"). Given some input Tensors with 2025-10-10T01:44:50.2280869Z certain properties (sizes/strides/storage_offset/device), it specifies 2025-10-10T01:44:50.2281702Z what the properties of the output Tensors are. 
2025-10-10T01:44:50.2282648Z 2025-10-10T01:44:50.2283073Z The FakeTensor implementation has the same signature as the operator. 2025-10-10T01:44:50.2284018Z It is run for both FakeTensors and meta tensors. To write a FakeTensor 2025-10-10T01:44:50.2284911Z implementation, assume that all Tensor inputs to the operator are 2025-10-10T01:44:50.2285791Z regular CPU/CUDA/Meta tensors, but they do not have storage, and 2025-10-10T01:44:50.2286656Z you are trying to return regular CPU/CUDA/Meta tensor(s) as output. 2025-10-10T01:44:50.2287903Z The FakeTensor implementation must consist of only PyTorch operations 2025-10-10T01:44:50.2288795Z (and may not directly access the storage or data of any input or 2025-10-10T01:44:50.2289489Z intermediate Tensors). 2025-10-10T01:44:50.2289797Z 2025-10-10T01:44:50.2290077Z This API may be used as a decorator (see examples). 2025-10-10T01:44:50.2290513Z 2025-10-10T01:44:50.2290764Z For a detailed guide on custom ops, please see 2025-10-10T01:44:50.2291618Z https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html 2025-10-10T01:44:50.2292213Z 2025-10-10T01:44:50.2292372Z Args: 2025-10-10T01:44:50.2292969Z op_name: Operator name (along with the overload) or OpOverload object. 2025-10-10T01:44:50.2293737Z func: Fake tensor implementation. 2025-10-10T01:44:50.2294453Z lib (Optional[Library]): Library to register the fake tensor to. 2025-10-10T01:44:50.2295288Z allow_override: Flag controlling if we want to override an 2025-10-10T01:44:50.2296081Z existing registered fake impl. This is by default off, 2025-10-10T01:44:50.2296873Z and will error you're trying to register a fake impl to 2025-10-10T01:44:50.2297668Z an operator that already has a fake impl. This also only 2025-10-10T01:44:50.2298436Z applies if the custom operator was not created via 2025-10-10T01:44:50.2299228Z torch.library.custom_op, as overriding and existing fake 2025-10-10T01:44:50.2299938Z impl is already allowed. 
2025-10-10T01:44:50.2300300Z 2025-10-10T01:44:50.2300464Z Examples: 2025-10-10T01:44:50.2300862Z >>> import torch 2025-10-10T01:44:50.2301333Z >>> import numpy as np 2025-10-10T01:44:50.2301825Z >>> from torch import Tensor 2025-10-10T01:44:50.2302313Z >>> 2025-10-10T01:44:50.2302836Z >>> # Example 1: an operator without data-dependent output shape 2025-10-10T01:44:50.2303716Z >>> @torch.library.custom_op("mylib::custom_linear", mutates_args=()) 2025-10-10T01:44:50.2304623Z >>> def custom_linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor: 2025-10-10T01:44:50.2305485Z >>> raise NotImplementedError("Implementation goes here") 2025-10-10T01:44:50.2306115Z >>> 2025-10-10T01:44:50.2306612Z >>> @torch.library.register_fake("mylib::custom_linear") 2025-10-10T01:44:50.2307263Z >>> def _(x, weight, bias): 2025-10-10T01:44:50.2307772Z >>> assert x.dim() == 2 2025-10-10T01:44:50.2308293Z >>> assert weight.dim() == 2 2025-10-10T01:44:50.2308835Z >>> assert bias.dim() == 1 2025-10-10T01:44:50.2309396Z >>> assert x.shape[1] == weight.shape[1] 2025-10-10T01:44:50.2310027Z >>> assert weight.shape[0] == bias.shape[0] 2025-10-10T01:44:50.2310648Z >>> assert x.device == weight.device 2025-10-10T01:44:50.2311175Z >>> 2025-10-10T01:44:50.2311593Z >>> return (x @ weight.t()) + bias 2025-10-10T01:44:50.2312112Z >>> 2025-10-10T01:44:50.2312621Z >>> with torch._subclasses.fake_tensor.FakeTensorMode(): 2025-10-10T01:44:50.2313280Z >>> x = torch.randn(2, 3) 2025-10-10T01:44:50.2313799Z >>> w = torch.randn(3, 3) 2025-10-10T01:44:50.2314440Z >>> b = torch.randn(3) 2025-10-10T01:44:50.2315010Z >>> y = torch.ops.mylib.custom_linear(x, w, b) 2025-10-10T01:44:50.2315901Z >>> 2025-10-10T01:44:50.2316294Z >>> assert y.shape == (2, 3) 2025-10-10T01:44:50.2316773Z >>> 2025-10-10T01:44:50.2317262Z >>> # Example 2: an operator with data-dependent output shape 2025-10-10T01:44:50.2318102Z >>> @torch.library.custom_op("mylib::custom_nonzero", mutates_args=()) 2025-10-10T01:44:50.2318894Z >>> def 
custom_nonzero(x: Tensor) -> Tensor: 2025-10-10T01:44:50.2319487Z >>> x_np = x.numpy(force=True) 2025-10-10T01:44:50.2320394Z >>> res = np.stack(np.nonzero(x_np), axis=1) 2025-10-10T01:44:50.2321027Z >>> return torch.tensor(res, device=x.device) 2025-10-10T01:44:50.2321568Z >>> 2025-10-10T01:44:50.2322096Z >>> @torch.library.register_fake("mylib::custom_nonzero") 2025-10-10T01:44:50.2322718Z >>> def _(x): 2025-10-10T01:44:50.2323234Z >>> # Number of nonzero-elements is data-dependent. 2025-10-10T01:44:50.2323941Z >>> # Since we cannot peek at the data in an fake impl, 2025-10-10T01:44:50.2324653Z >>> # we use the ctx object to construct a new symint that 2025-10-10T01:44:50.2325316Z >>> # represents the data-dependent size. 2025-10-10T01:44:50.2325917Z >>> ctx = torch.library.get_ctx() 2025-10-10T01:44:50.2326480Z >>> nnz = ctx.new_dynamic_size() 2025-10-10T01:44:50.2327017Z >>> shape = [nnz, x.dim()] 2025-10-10T01:44:50.2327612Z >>> result = x.new_empty(shape, dtype=torch.int64) 2025-10-10T01:44:50.2328223Z >>> return result 2025-10-10T01:44:50.2328661Z >>> 2025-10-10T01:44:50.2329168Z >>> from torch.fx.experimental.proxy_tensor import make_fx 2025-10-10T01:44:50.2329805Z >>> 2025-10-10T01:44:50.2330199Z >>> x = torch.tensor([0, 1, 2, 3, 4, 0]) 2025-10-10T01:44:50.2330977Z >>> trace = make_fx(torch.ops.mylib.custom_nonzero, tracing_mode="symbolic")(x) 2025-10-10T01:44:50.2331754Z >>> trace.print_readable() 2025-10-10T01:44:50.2332222Z >>> 2025-10-10T01:44:50.2332808Z >>> assert torch.allclose(trace(x), torch.ops.mylib.custom_nonzero(x)) 2025-10-10T01:44:50.2333375Z 2025-10-10T01:44:50.2333518Z 2025-10-10T01:44:50.2334565Z Original Error: IndentationError('expected an indented block after function definition on line 37', ('', 38, 1, '_._ = None\n', 38, 2)) 2025-10-10T01:44:50.2335589Z 2025-10-10T01:44:50.2335742Z _._ = None 2025-10-10T01:44:50.2336084Z ^ 2025-10-10T01:44:50.2386824Z msg = Cannot scrape callname=get_kernel in 
modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=1476. 2025-10-10T01:44:50.2388375Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:50.2389370Z Returns the computed kernel for a given operator and dispatch key. 2025-10-10T01:44:50.2389904Z 2025-10-10T01:44:50.2390290Z This function retrieves the kernel that would be executed for a given 2025-10-10T01:44:50.2391251Z operator and dispatch key combination. The returned SafeKernelFunction 2025-10-10T01:44:50.2392161Z can be used to call the kernel in a boxed fashion. The intended use 2025-10-10T01:44:50.2393006Z case for this function is to retrieve the original kernel for a given 2025-10-10T01:44:50.2393892Z dispatch key and then register another kernel to the same dispatch key 2025-10-10T01:44:50.2394883Z that calls into the original kernel for certain cases. 2025-10-10T01:44:50.2395337Z 2025-10-10T01:44:50.2395491Z Args: 2025-10-10T01:44:50.2396039Z op: Operator name (along with the overload) or OpOverload object 2025-10-10T01:44:50.2396946Z Can be a string (e.g., "aten::add.Tensor"), an OpOverload, or a CustomOpDef. 2025-10-10T01:44:50.2397931Z dispatch_key (str | torch.DispatchKey): The dispatch key to get the kernel for. 2025-10-10T01:44:50.2398853Z Can be a string (e.g., "CPU", "CUDA") or a DispatchKey enum value. 2025-10-10T01:44:50.2399351Z 2025-10-10T01:44:50.2399513Z Returns: 2025-10-10T01:44:50.2400601Z torch._C._SafeKernelFunction: A safe kernel function that can be used to 2025-10-10T01:44:50.2401368Z call the kernel. 2025-10-10T01:44:50.2401668Z 2025-10-10T01:44:50.2401817Z Raises: 2025-10-10T01:44:50.2402267Z RuntimeError: If the operator does not exist. 
2025-10-10T01:44:50.2402720Z 2025-10-10T01:44:50.2402879Z Example: 2025-10-10T01:44:50.2403297Z >>> # Get the CPU kernel for torch.add 2025-10-10T01:44:50.2403995Z >>> kernel = torch.library.get_kernel("aten::add.Tensor", "CPU") 2025-10-10T01:44:50.2405051Z >>> 2025-10-10T01:44:50.2405467Z >>> # You can also use DispatchKey enum 2025-10-10T01:44:50.2406265Z >>> kernel = torch.library.get_kernel("aten::add.Tensor", torch.DispatchKey.CPU) 2025-10-10T01:44:50.2407019Z >>> 2025-10-10T01:44:50.2407405Z >>> # Or use an OpOverload directly 2025-10-10T01:44:50.2408131Z >>> kernel = torch.library.get_kernel(torch.ops.aten.add.Tensor, "CPU") 2025-10-10T01:44:50.2408836Z >>> 2025-10-10T01:44:50.2409385Z >>> # Example: Using get_kernel in a custom op with conditional dispatch 2025-10-10T01:44:50.2410136Z >>> # Get the original kernel for torch.sin 2025-10-10T01:44:50.2410876Z >>> original_sin_kernel = torch.library.get_kernel("aten::sin", "CPU") 2025-10-10T01:44:50.2411545Z >>> 2025-10-10T01:44:50.2412117Z >>> # If input has negative values, use original sin, otherwise return zeros 2025-10-10T01:44:50.2412911Z >>> def conditional_sin_impl(dispatch_keys, x): 2025-10-10T01:44:50.2413485Z >>> if (x < 0).any(): 2025-10-10T01:44:50.2414106Z >>> return original_sin_kernel.call_boxed(dispatch_keys, x) 2025-10-10T01:44:50.2414742Z >>> else: 2025-10-10T01:44:50.2415182Z >>> return torch.zeros_like(x) 2025-10-10T01:44:50.2415682Z >>> 2025-10-10T01:44:50.2416111Z >>> lib = torch.library.Library("aten", "IMPL") 2025-10-10T01:44:50.2416946Z >>> # with_keyset=True so the first argument to the impl is the current DispatchKeySet 2025-10-10T01:44:50.2417867Z >>> which needs to be the first argument to ``kernel.call_boxed`` 2025-10-10T01:44:50.2418675Z >>> lib.impl("sin", conditional_sin_impl, "CPU", with_keyset=True) 2025-10-10T01:44:50.2419307Z >>> 2025-10-10T01:44:50.2419696Z >>> # Test the conditional behavior 2025-10-10T01:44:50.2420252Z >>> x_positive = torch.tensor([1.0, 2.0]) 
2025-10-10T01:44:50.2420837Z >>> x_mixed = torch.tensor([-1.0, 2.0]) 2025-10-10T01:44:50.2421386Z >>> torch.sin(x_positive) 2025-10-10T01:44:50.2421871Z tensor([0., 0.]) 2025-10-10T01:44:50.2422319Z >>> torch.sin(x_mixed) 2025-10-10T01:44:50.2422789Z tensor([-0.8415, 0.9093]) 2025-10-10T01:44:50.2423233Z 2025-10-10T01:44:50.2424219Z Original Error: SyntaxError('invalid syntax', ('', 23, 7, 'which needs to be the first argument to ``kernel.call_boxed``\n', 23, 12)) 2025-10-10T01:44:50.2425197Z 2025-10-10T01:44:50.2425495Z which needs to be the first argument to ``kernel.call_boxed`` 2025-10-10T01:44:50.2426115Z ^ 2025-10-10T01:44:50.3604266Z msg = Cannot scrape callname=cudart in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py line=435. 2025-10-10T01:44:50.3605814Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:50.3606637Z Retrieves the CUDA runtime API module. 2025-10-10T01:44:50.3607051Z 2025-10-10T01:44:50.3607060Z 2025-10-10T01:44:50.3607478Z This function initializes the CUDA runtime environment if it is not already 2025-10-10T01:44:50.3608458Z initialized and returns the CUDA runtime API module (_cudart). The CUDA 2025-10-10T01:44:50.3609410Z runtime API module provides access to various CUDA runtime functions. 2025-10-10T01:44:50.3610014Z 2025-10-10T01:44:50.3610164Z Args: 2025-10-10T01:44:50.3610523Z ``None`` 2025-10-10T01:44:50.3610771Z 2025-10-10T01:44:50.3611411Z Returns: 2025-10-10T01:44:50.3611905Z module: The CUDA runtime API module (_cudart). 2025-10-10T01:44:50.3612343Z 2025-10-10T01:44:50.3612495Z Raises: 2025-10-10T01:44:50.3613114Z RuntimeError: If CUDA cannot be re-initialized in a forked subprocess. 2025-10-10T01:44:50.3614282Z AssertionError: If PyTorch is not compiled with CUDA support or if libcudart functions are unavailable. 
2025-10-10T01:44:50.3615105Z 2025-10-10T01:44:50.3615339Z Example of CUDA operations with profiling: 2025-10-10T01:44:50.3616258Z >>> import torch 2025-10-10T01:44:50.3616792Z >>> from torch.cuda import cudart, check_error 2025-10-10T01:44:50.3617357Z >>> import os 2025-10-10T01:44:50.3617750Z >>> 2025-10-10T01:44:50.3618153Z >>> os.environ["CUDA_PROFILE"] = "1" 2025-10-10T01:44:50.3618663Z >>> 2025-10-10T01:44:50.3619099Z >>> def perform_cuda_operations_with_streams(): 2025-10-10T01:44:50.3619727Z >>> stream = torch.cuda.Stream() 2025-10-10T01:44:50.3620312Z >>> with torch.cuda.stream(stream): 2025-10-10T01:44:50.3620913Z >>> x = torch.randn(100, 100, device='cuda') 2025-10-10T01:44:50.3621519Z >>> y = torch.randn(100, 100, device='cuda') 2025-10-10T01:44:50.3622080Z >>> z = torch.mul(x, y) 2025-10-10T01:44:50.3622588Z >>> return z 2025-10-10T01:44:50.3622998Z >>> 2025-10-10T01:44:50.3623392Z >>> torch.cuda.synchronize() 2025-10-10T01:44:50.3623972Z >>> print("====== Start nsys profiling ======") 2025-10-10T01:44:50.3624607Z >>> check_error(cudart().cudaProfilerStart()) 2025-10-10T01:44:50.3625250Z >>> with torch.autograd.profiler.emit_nvtx(): 2025-10-10T01:44:50.3625916Z >>> result = perform_cuda_operations_with_streams() 2025-10-10T01:44:50.3626568Z >>> print("CUDA operations completed.") 2025-10-10T01:44:50.3627235Z >>> check_error(torch.cuda.cudart().cudaProfilerStop()) 2025-10-10T01:44:50.3627900Z >>> print("====== End nsys profiling ======") 2025-10-10T01:44:50.3628284Z 2025-10-10T01:44:50.3628625Z To run this example and save the profiling information, execute: 2025-10-10T01:44:50.3629766Z >>> $ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py 2025-10-10T01:44:50.3630545Z 2025-10-10T01:44:50.3630969Z This command profiles the CUDA operations in the provided script and saves 2025-10-10T01:44:50.3631899Z the profiling information to a file named `trace_name.prof`. 
2025-10-10T01:44:50.3632791Z The `--profile-from-start off` option ensures that profiling starts only 2025-10-10T01:44:50.3633604Z after the `cudaProfilerStart` call in the script. 2025-10-10T01:44:50.3634590Z The `--csv` and `--print-summary` options format the profiling output as a 2025-10-10T01:44:50.3635367Z CSV file and print a summary, respectively. 2025-10-10T01:44:50.3636173Z The `-o` option specifies the output file name, and the `-f` option forces the 2025-10-10T01:44:50.3637011Z overwrite of the output file if it already exists. 2025-10-10T01:44:50.3637589Z 2025-10-10T01:44:50.3638859Z Original Error: SyntaxError('invalid syntax', ('', 1, 1, '$ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py\n', 1, 2)) 2025-10-10T01:44:50.3640096Z 2025-10-10T01:44:50.3640675Z $ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py 2025-10-10T01:44:50.3641587Z ^ 2025-10-10T01:44:50.4553187Z msg = Cannot scrape callname=is_available in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py line=66. 2025-10-10T01:44:50.4554954Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:50.4555994Z Check if the current accelerator is available at runtime: it was build, all the 2025-10-10T01:44:50.4557456Z required drivers are available and at least one device is visible. 2025-10-10T01:44:50.4558282Z See :ref:`accelerator` for details. 2025-10-10T01:44:50.4558729Z 2025-10-10T01:44:50.4558886Z Returns: 2025-10-10T01:44:50.4559587Z bool: A boolean indicating if there is an available :ref:`accelerator`. 2025-10-10T01:44:50.4560243Z 2025-10-10T01:44:50.4560675Z .. note:: This API delegates to the device-specific version of `is_available`. 2025-10-10T01:44:50.4561702Z On CUDA, when the environment variable ``PYTORCH_NVML_BASED_CUDA_CHECK=1`` is set, 2025-10-10T01:44:50.4563097Z this function will NOT poison fork. 
Otherwise, it will. For more details, see 2025-10-10T01:44:50.4563958Z :ref:`multiprocessing-poison-fork-note`. 2025-10-10T01:44:50.4564389Z 2025-10-10T01:44:50.4564548Z Example:: 2025-10-10T01:44:50.4564783Z 2025-10-10T01:44:50.4565230Z >>> assert torch.accelerator.is_available() "No available accelerators detected." 2025-10-10T01:44:50.4566011Z 2025-10-10T01:44:50.4567110Z Original Error: SyntaxError('invalid syntax', ('', 1, 41, 'assert torch.accelerator.is_available() "No available accelerators detected."\n', 1, 78)) 2025-10-10T01:44:50.4568187Z 2025-10-10T01:44:50.4568611Z assert torch.accelerator.is_available() "No available accelerators detected." 2025-10-10T01:44:50.4569394Z ^ 2025-10-10T01:44:50.4573368Z msg = Cannot scrape callname=synchronize in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py line=212. 2025-10-10T01:44:50.4574911Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:50.4575870Z Wait for all kernels in all streams on the given device to complete. 2025-10-10T01:44:50.4576418Z 2025-10-10T01:44:50.4576570Z Args: 2025-10-10T01:44:50.4577313Z device (:class:`torch.device`, str, int, optional): device for which to synchronize. It must match 2025-10-10T01:44:50.4578425Z the current :ref:`accelerator` device type. If not given, 2025-10-10T01:44:50.4579352Z use :func:`torch.accelerator.current_device_index` by default. 2025-10-10T01:44:50.4579892Z 2025-10-10T01:44:50.4580421Z .. note:: This function is a no-op if the current :ref:`accelerator` is not initialized. 2025-10-10T01:44:50.4581117Z 2025-10-10T01:44:50.4581278Z Example:: 2025-10-10T01:44:50.4581515Z 2025-10-10T01:44:50.4581761Z >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA) 2025-10-10T01:44:50.4582627Z >>> assert torch.accelerator.is_available() "No available accelerators detected." 
2025-10-10T01:44:50.4583499Z >>> start_event = torch.Event(enable_timing=True) 2025-10-10T01:44:50.4584141Z >>> end_event = torch.Event(enable_timing=True) 2025-10-10T01:44:50.4584742Z >>> start_event.record() 2025-10-10T01:44:50.4585474Z >>> tensor = torch.randn(100, device=torch.accelerator.current_accelerator()) 2025-10-10T01:44:50.4586261Z >>> sum = torch.sum(tensor) 2025-10-10T01:44:50.4586774Z >>> end_event.record() 2025-10-10T01:44:50.4587299Z >>> torch.accelerator.synchronize() 2025-10-10T01:44:50.4587992Z >>> elapsed_time_ms = start_event.elapsed_time(end_event) 2025-10-10T01:44:50.4588614Z 2025-10-10T01:44:50.4589723Z Original Error: SyntaxError('invalid syntax', ('', 2, 41, 'assert torch.accelerator.is_available() "No available accelerators detected."\n', 2, 78)) 2025-10-10T01:44:50.4590808Z 2025-10-10T01:44:50.4591242Z assert torch.accelerator.is_available() "No available accelerators detected." 2025-10-10T01:44:50.4592027Z ^ 2025-10-10T01:44:51.0421677Z msg = Cannot scrape callname=unsafe_generate_fake_kernels in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_profile.py line=94. 2025-10-10T01:44:51.0423440Z Caused by: DoctestParseError('Failed to parse doctest in _label_docsrc_lines') 2025-10-10T01:44:51.0424601Z 2025-10-10T01:44:51.0425021Z Registers a fake kernel based on the given operator profiles. This fake 2025-10-10T01:44:51.0426023Z kernel registration will override any existing fake kernel registrations. 2025-10-10T01:44:51.0426626Z 2025-10-10T01:44:51.0426996Z The input is a dictionary mapping operator names to a set of operator 2025-10-10T01:44:51.0427927Z profiles, which we will use to generate fake kernels. The operator profiles 2025-10-10T01:44:51.0428834Z are a record of the input and output tensor metadata. 
Based on this 2025-10-10T01:44:51.0430058Z information we will match a given input to the recorded profile, and return 2025-10-10T01:44:51.0431017Z an output with the same metadata as in the recorded profile. If a profile 2025-10-10T01:44:51.0431839Z doesn't exist then an exception will be thrown. 2025-10-10T01:44:51.0432252Z 2025-10-10T01:44:51.0432646Z The fake kernel generation is considered unsafe because it relies on the 2025-10-10T01:44:51.0433609Z rigid, pre-defined operator profiles that do not account for potential 2025-10-10T01:44:51.0434725Z variations in output behavior. Specifically, the generated kernels assume a 2025-10-10T01:44:51.0435717Z fixed relationship between input and output ranks. However, in reality, it's 2025-10-10T01:44:51.0436708Z possible that data-dependent operations may produce outputs of different 2025-10-10T01:44:51.0437631Z ranks even when given inputs of the same rank. The generated fake kernels 2025-10-10T01:44:51.0438536Z are inflexible and unable to accommodate these nuances, making them 2025-10-10T01:44:51.0439252Z potentially unsafe. 2025-10-10T01:44:51.0439512Z 2025-10-10T01:44:51.0439661Z Args: 2025-10-10T01:44:51.0440225Z op_profiles (dict[str, set[OpProfile]]): A dictionary mapping operator 2025-10-10T01:44:51.0441092Z name to a set of operator profiles from which we will generate fake 2025-10-10T01:44:51.0441749Z kernels. 
2025-10-10T01:44:51.0441982Z 2025-10-10T01:44:51.0442145Z Examples: 2025-10-10T01:44:51.0442350Z 2025-10-10T01:44:51.0442665Z >>> # Example: Registering an op-profile from draft-export 2025-10-10T01:44:51.0443284Z >>> import torch 2025-10-10T01:44:51.0443816Z >>> from torch.export._draft_export import draft_export 2025-10-10T01:44:51.0444413Z >>> 2025-10-10T01:44:51.0444933Z >>> @torch.library.custom_op("mylib::foo", mutates_args=()) 2025-10-10T01:44:51.0445617Z >>> def foo(x: Tensor, y: Tensor) -> Tensor: 2025-10-10T01:44:51.0446169Z >>> return x + y 2025-10-10T01:44:51.0446592Z >>> 2025-10-10T01:44:51.0446978Z >>> class M(torch.nn.Module): 2025-10-10T01:44:51.0447490Z >>> def forward(self, a, b): 2025-10-10T01:44:51.0448078Z >>> res = torch.ops.mylib.foo(a, b) # no fake impl 2025-10-10T01:44:51.0448668Z >>> return res 2025-10-10T01:44:51.0449089Z >>> 2025-10-10T01:44:51.0449590Z >>> ep = draft_export(M(), (torch.ones(3, 4), torch.ones(3, 4)) 2025-10-10T01:44:51.0450209Z >>> 2025-10-10T01:44:51.0450908Z >>> with torch._library.fake_profile.unsafe_generate_fake_kernels(ep._report.op_profiles): 2025-10-10T01:44:51.0451800Z >>> decomp = ep.run_decompositions() 2025-10-10T01:44:51.0452185Z 2025-10-10T01:44:51.0452193Z 2025-10-10T01:44:51.0452972Z Original Error: IncompleteParseError('ill-formed doctest: all parts have been processed but the doctest source is not balanced') 2025-10-10T01:44:51.0453923Z 2025-10-10T01:44:51.0653526Z msg = Cannot scrape callname=CustomOpDef.register_fake in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py line=401. 2025-10-10T01:44:51.0655306Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:51.0656236Z Register a FakeTensor implementation for this custom op. 2025-10-10T01:44:51.0656718Z 2025-10-10T01:44:51.0657150Z This is necessary to get the operator to work efficiently with torch.compile. 
2025-10-10T01:44:51.0657759Z 2025-10-10T01:44:51.0658679Z The Fake impl (sometimes also known as a meta kernel or abstract impl) 2025-10-10T01:44:51.0659628Z specifies the behavior of this operator on Tensors that carry no data. 2025-10-10T01:44:51.0660442Z Given some input Tensors with certain properties 2025-10-10T01:44:51.0661312Z (sizes/strides/storage_offset/device), it specifies what the properties of 2025-10-10T01:44:51.0662111Z the output Tensors are. 2025-10-10T01:44:51.0662434Z 2025-10-10T01:44:51.0662799Z Please see :func:`torch.library.register_fake` for more details. 2025-10-10T01:44:51.0663648Z 2025-10-10T01:44:51.0663811Z Args: 2025-10-10T01:44:51.0664334Z fn (Callable): The function to register as the FakeTensor 2025-10-10T01:44:51.0664974Z implementation. 2025-10-10T01:44:51.0665299Z 2025-10-10T01:44:51.0665465Z Examples: 2025-10-10T01:44:51.0665875Z >>> import torch 2025-10-10T01:44:51.0666342Z >>> import numpy as np 2025-10-10T01:44:51.0666873Z >>> from torch import Tensor 2025-10-10T01:44:51.0667373Z >>> 2025-10-10T01:44:51.0667928Z >>> # Example 1: an operator without data-dependent output shape 2025-10-10T01:44:51.0672982Z >>> @torch.library.custom_op("mylib::linear", mutates_args=()) 2025-10-10T01:44:51.0673892Z >>> def linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor: 2025-10-10T01:44:51.0674797Z >>> return (x @ weight.t()) + bias 2025-10-10T01:44:51.0675336Z >>> 2025-10-10T01:44:51.0675781Z >>> @linear.register_fake 2025-10-10T01:44:51.0676325Z >>> def _(x, weight, bias): 2025-10-10T01:44:51.0676855Z >>> assert x.dim() == 2 2025-10-10T01:44:51.0677397Z >>> assert weight.dim() == 2 2025-10-10T01:44:51.0677948Z >>> assert bias.dim() == 1 2025-10-10T01:44:51.0678521Z >>> assert x.shape[1] == weight.shape[1] 2025-10-10T01:44:51.0679150Z >>> assert weight.shape[0] == bias.shape[0] 2025-10-10T01:44:51.0679785Z >>> assert x.device == weight.device 2025-10-10T01:44:51.0680431Z >>> return x.new_empty(x.size(0), weight.size(0)) 
2025-10-10T01:44:51.0681039Z >>> 2025-10-10T01:44:51.0681470Z >>> x = torch.randn(2, 2) 2025-10-10T01:44:51.0682014Z >>> weight = torch.randn(2, 2) 2025-10-10T01:44:51.0682550Z >>> bias = torch.randn(2) 2025-10-10T01:44:51.0683132Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-10-10T01:44:51.0683889Z >>> out = torch.compile(linear, fullgraph=True)(x, weight, bias) 2025-10-10T01:44:51.0684627Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-10-10T01:44:51.0685449Z >>> assert torch.allclose(out, torch.nn.functional.linear(x, weight, bias)) 2025-10-10T01:44:51.0686201Z >>> 2025-10-10T01:44:51.0686731Z >>> # Example 2: an operator with data-dependent output shape 2025-10-10T01:44:51.0687566Z >>> @torch.library.custom_op("mylib::nonzero", mutates_args=()) 2025-10-10T01:44:51.0688289Z >>> def nonzero(x: Tensor) -> Tensor: 2025-10-10T01:44:51.0688860Z >>> x_np = x.cpu().numpy() 2025-10-10T01:44:51.0689425Z >>> res = np.stack(np.nonzero(x_np), axis=1) 2025-10-10T01:44:51.0690056Z >>> return torch.tensor(res, device=x.device) 2025-10-10T01:44:51.0690609Z >>> 2025-10-10T01:44:51.0691016Z >>> @nonzero.register_fake 2025-10-10T01:44:51.0691535Z >>> def _(x): 2025-10-10T01:44:51.0692073Z >>> # Number of nonzero-elements is data-dependent. 2025-10-10T01:44:51.0692793Z >>> # Since we cannot peek at the data in an abstract impl, 2025-10-10T01:44:51.0693549Z >>> # we use the ctx object to construct a new symint that 2025-10-10T01:44:51.0694224Z >>> # represents the data-dependent size. 
2025-10-10T01:44:51.0694834Z >>> ctx = torch.library.get_ctx() 2025-10-10T01:44:51.0695822Z >>> nnz = ctx.new_dynamic_size() 2025-10-10T01:44:51.0696391Z >>> shape = [nnz, x.dim()] 2025-10-10T01:44:51.0697004Z >>> result = x.new_empty(shape, dtype=torch.int64) 2025-10-10T01:44:51.0697610Z >>> return result 2025-10-10T01:44:51.0698066Z >>> 2025-10-10T01:44:51.0698494Z >>> x = torch.tensor([0, 1, 2, 0, 0, 1]) 2025-10-10T01:44:51.0699107Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-10-10T01:44:51.0700101Z >>> out = torch.compile(nonzero, fullgraph=True)(x) 2025-10-10T01:44:51.0700745Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-10-10T01:44:51.0701372Z >>> assert torch.allclose(out, x.nonzero()) 2025-10-10T01:44:51.0701772Z 2025-10-10T01:44:51.0701929Z 2025-10-10T01:44:51.0703015Z Original Error: IndentationError('expected an indented block after function definition on line 36', ('', 37, 1, '_._ = None\n', 37, 2)) 2025-10-10T01:44:51.0704048Z 2025-10-10T01:44:51.0704204Z _._ = None 2025-10-10T01:44:51.0704549Z ^ 2025-10-10T01:44:51.4009474Z msg = Cannot scrape callname=annotate in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/traceback.py line=244. 2025-10-10T01:44:51.4010974Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:51.4011601Z 2025-10-10T01:44:51.4011985Z Temporarily adds custom annotations to the current tracing context. 2025-10-10T01:44:51.4012865Z The fx_node produced from this tracing context will have the 2025-10-10T01:44:51.4013629Z custom annotations in node.metadata["custom"] field. 2025-10-10T01:44:51.4014079Z 2025-10-10T01:44:51.4014492Z This context manager allows you to insert arbitrary metadata into the PT2 2025-10-10T01:44:51.4015467Z tracing system by updating the global `current_meta["custom"]` dictionary. 2025-10-10T01:44:51.4016393Z The annotations are automatically reverted after the context exits. 
2025-10-10T01:44:51.4016945Z 2025-10-10T01:44:51.4017418Z This is intended for advanced users who need to attach additional metadata to the fx nodes 2025-10-10T01:44:51.4018462Z (e.g., for debugging, analysis, or external tooling) during export tracing. 2025-10-10T01:44:51.4019031Z 2025-10-10T01:44:51.4019175Z Note: 2025-10-10T01:44:51.4019750Z This API is **not backward compatible** and may evolve in future releases. 2025-10-10T01:44:51.4020323Z 2025-10-10T01:44:51.4020462Z Note: 2025-10-10T01:44:51.4021060Z This API is not compatible with fx.symbolic_trace or jit.trace. It's intended 2025-10-10T01:44:51.4021981Z to be used with PT2 family of tracers, e.g. torch.export and dynamo. 2025-10-10T01:44:51.4022506Z 2025-10-10T01:44:51.4022645Z Args: 2025-10-10T01:44:51.4023222Z annotation_dict (dict): A dictionary of custom key-value pairs to inject 2025-10-10T01:44:51.4023966Z into the FX trace metadata. 2025-10-10T01:44:51.4024311Z 2025-10-10T01:44:51.4024463Z Example: 2025-10-10T01:44:51.4024949Z >>> with annotate({"source": "custom_pass", "tag": 42}): 2025-10-10T01:44:51.4025563Z ... # compute here 2025-10-10T01:44:51.4026152Z # After exiting the context, custom annotations are removed. 2025-10-10T01:44:51.4026653Z 2025-10-10T01:44:51.4027538Z Original Error: IndentationError("expected an indented block after 'with' statement on line 1", ('', 2, 19, ' # compute here\n', 2, -1)) 2025-10-10T01:44:51.4028564Z 2025-10-10T01:44:51.4028717Z # compute here 2025-10-10T01:44:51.4029124Z ^ 2025-10-10T01:44:51.7975642Z msg = Cannot scrape callname=ReduceLROnPlateau in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py line=1587. 2025-10-10T01:44:51.7977305Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:51.7978236Z Reduce learning rate when a metric has stopped improving. 
2025-10-10T01:44:51.7978749Z 2025-10-10T01:44:51.7979628Z Models often benefit from reducing the learning rate by a factor 2025-10-10T01:44:51.7980504Z of 2-10 once learning stagnates. This scheduler reads a metrics 2025-10-10T01:44:51.7981330Z quantity and if no improvement is seen for a 'patience' number 2025-10-10T01:44:51.7982058Z of epochs, the learning rate is reduced. 2025-10-10T01:44:51.7982441Z 2025-10-10T01:44:51.7982596Z Args: 2025-10-10T01:44:51.7983042Z optimizer (Optimizer): Wrapped optimizer. 2025-10-10T01:44:51.7983725Z mode (str): One of `min`, `max`. In `min` mode, lr will 2025-10-10T01:44:51.7984781Z be reduced when the quantity monitored has stopped 2025-10-10T01:44:51.7985503Z decreasing; in `max` mode it will be reduced when the 2025-10-10T01:44:51.7986291Z quantity monitored has stopped increasing. Default: 'min'. 2025-10-10T01:44:51.7987084Z factor (float): Factor by which the learning rate will be 2025-10-10T01:44:51.7987795Z reduced. new_lr = lr * factor. Default: 0.1. 2025-10-10T01:44:51.7988609Z patience (int): The number of allowed epochs with no improvement after 2025-10-10T01:44:51.7989397Z which the learning rate will be reduced. 2025-10-10T01:44:51.7990209Z For example, consider the case of having no patience (`patience = 0`). 2025-10-10T01:44:51.7991398Z In the first epoch, a baseline is established and is always considered good as there's no previous baseline. 2025-10-10T01:44:51.7992535Z In the second epoch, if the performance is worse than the baseline, 2025-10-10T01:44:51.7993335Z we have what is considered an intolerable epoch. 2025-10-10T01:44:51.7994570Z Since the count of intolerable epochs (1) is greater than the patience level (0), 2025-10-10T01:44:51.7995497Z the learning rate is reduced at the end of this epoch. 2025-10-10T01:44:51.7996520Z From the third epoch onwards, the learning rate continues to be reduced at the end of each epoch 2025-10-10T01:44:51.7997777Z if the performance is worse than the baseline. 
If the performance improves or remains the same, 2025-10-10T01:44:51.7998696Z the learning rate is not adjusted. 2025-10-10T01:44:51.7999248Z Default: 10. 2025-10-10T01:44:51.7999861Z threshold (float): Threshold for measuring the new optimum, 2025-10-10T01:44:51.8000639Z to only focus on significant changes. Default: 1e-4. 2025-10-10T01:44:51.8001384Z threshold_mode (str): One of `rel`, `abs`. In `rel` mode, 2025-10-10T01:44:51.8002125Z dynamic_threshold = best * ( 1 + threshold ) in 'max' 2025-10-10T01:44:51.8002821Z mode or best * ( 1 - threshold ) in `min` mode. 2025-10-10T01:44:51.8003512Z In `abs` mode, dynamic_threshold = best + threshold in 2025-10-10T01:44:51.8004257Z `max` mode or best - threshold in `min` mode. Default: 'rel'. 2025-10-10T01:44:51.8005039Z cooldown (int): Number of epochs to wait before resuming 2025-10-10T01:44:51.8005824Z normal operation after lr has been reduced. Default: 0. 2025-10-10T01:44:51.8006573Z min_lr (float or list): A scalar or a list of scalars. A 2025-10-10T01:44:51.8007293Z lower bound on the learning rate of all param groups 2025-10-10T01:44:51.8007980Z or each group respectively. Default: 0. 2025-10-10T01:44:51.8008730Z eps (float): Minimal decay applied to lr. If the difference 2025-10-10T01:44:51.8009520Z between new and old lr is smaller than eps, the update is 2025-10-10T01:44:51.8010207Z ignored. Default: 1e-8. 2025-10-10T01:44:51.8010542Z 2025-10-10T01:44:51.8010706Z Example: 2025-10-10T01:44:51.8011168Z >>> # xdoctest: +SKIP 2025-10-10T01:44:51.8011866Z >>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9) 2025-10-10T01:44:51.8012706Z >>> scheduler = ReduceLROnPlateau(optimizer, "min") 2025-10-10T01:44:51.8013337Z >>> for epoch in range(10): 2025-10-10T01:44:51.8014170Z >>> train(...) 2025-10-10T01:44:51.8014660Z >>> val_loss = validate(...) 
2025-10-10T01:44:51.8015287Z >>> # Note that step should be called after validate() 2025-10-10T01:44:51.8015922Z >>> scheduler.step(val_loss) 2025-10-10T01:44:51.8016276Z 2025-10-10T01:44:51.8016653Z .. image:: ../scripts/lr_scheduler_images/ReduceLROnPlateau.png 2025-10-10T01:44:51.8017328Z 2025-10-10T01:44:51.8018212Z Original Error: IndentationError('unexpected indent', ('', 8, 4, ' scheduler.step(val_loss)\n', 8, -1)) 2025-10-10T01:44:51.8019457Z 2025-10-10T01:44:51.8019656Z scheduler.step(val_loss) 2025-10-10T01:44:51.8020115Z ^ 2025-10-10T01:44:54.1432288Z msg = Cannot scrape callname=ActivationSparsifier in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/activation_sparsifier/activation_sparsifier.py line=16. 2025-10-10T01:44:54.1434719Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:54.1435388Z 2025-10-10T01:44:54.1435862Z The Activation sparsifier class aims to sparsify/prune activations in a neural 2025-10-10T01:44:54.1436883Z network. The idea is to attach the sparsifier to a layer (or layers) and it 2025-10-10T01:44:54.1437896Z zeroes out the activations based on the mask_fn (or sparsification function) 2025-10-10T01:44:54.1438692Z input by the user. 2025-10-10T01:44:54.1439320Z The mask_fn is applied once all the inputs are aggregated and reduced i.e. 2025-10-10T01:44:54.1440160Z mask = mask_fn(reduce_fn(aggregate_fn(activations))) 2025-10-10T01:44:54.1440594Z 2025-10-10T01:44:54.1440780Z Note:: 2025-10-10T01:44:54.1441514Z The sparsification mask is computed on the input **before it goes through the attached layer**. 2025-10-10T01:44:54.1442228Z 2025-10-10T01:44:54.1442380Z Args: 2025-10-10T01:44:54.1442745Z model (nn.Module): 2025-10-10T01:44:54.1443380Z The model whose layers will be sparsified. 
The layers that needs to be 2025-10-10T01:44:54.1444341Z sparsified should be added separately using the register_layer() function 2025-10-10T01:44:54.1445124Z aggregate_fn (Optional, Callable): 2025-10-10T01:44:54.1445913Z default aggregate_fn that is used if not specified while registering the layer. 2025-10-10T01:44:54.1446806Z specifies how inputs should be aggregated over time. 2025-10-10T01:44:54.1447740Z The aggregate_fn should usually take 2 torch tensors and return the aggregated tensor. 2025-10-10T01:44:54.1448556Z Example 2025-10-10T01:44:54.1449082Z def add_agg_fn(tensor1, tensor2): return tensor1 + tensor2 2025-10-10T01:44:54.1449768Z reduce_fn (Optional, Callable): 2025-10-10T01:44:54.1450547Z default reduce_fn that is used if not specified while registering the layer. 2025-10-10T01:44:54.1451582Z reduce_fn will be called on the aggregated tensor i.e. the tensor obtained after 2025-10-10T01:44:54.1452400Z calling agg_fn() on all inputs. 2025-10-10T01:44:54.1452945Z Example 2025-10-10T01:44:54.1453556Z def mean_reduce_fn(agg_tensor): return agg_tensor.mean(dim=0) 2025-10-10T01:44:54.1454273Z mask_fn (Optional, Callable): 2025-10-10T01:44:54.1455149Z default mask_fn that is used to create the sparsification mask using the tensor obtained after 2025-10-10T01:44:54.1456276Z calling the reduce_fn(). This is used by default if a custom one is passed in the 2025-10-10T01:44:54.1457076Z register_layer(). 2025-10-10T01:44:54.1457983Z Note that the mask_fn() definition should contain the sparse arguments that is passed in sparse_config 2025-10-10T01:44:54.1458906Z arguments. 2025-10-10T01:44:54.1459390Z features (Optional, list): 2025-10-10T01:44:54.1459973Z default selected features to sparsify. 2025-10-10T01:44:54.1461595Z If this is non-empty, then the mask_fn will be applied for each feature of the input. 
2025-10-10T01:44:54.1462430Z For example, 2025-10-10T01:44:54.1463166Z mask = [mask_fn(reduce_fn(aggregated_fn(input[feature])) for feature in features] 2025-10-10T01:44:54.1463981Z feature_dim (Optional, int): 2025-10-10T01:44:54.1464819Z default dimension of input features. Again, features along this dim will be chosen 2025-10-10T01:44:54.1465657Z for sparsification. 2025-10-10T01:44:54.1466553Z sparse_config (Dict): 2025-10-10T01:44:54.1467250Z Default configuration for the mask_fn. This config will be passed 2025-10-10T01:44:54.1467981Z with the mask_fn() 2025-10-10T01:44:54.1468318Z 2025-10-10T01:44:54.1468470Z Example: 2025-10-10T01:44:54.1468853Z >>> # xdoctest: +SKIP 2025-10-10T01:44:54.1469318Z >>> model = SomeModel() 2025-10-10T01:44:54.1470026Z >>> act_sparsifier = ActivationSparsifier(...) # init activation sparsifier 2025-10-10T01:44:54.1470814Z >>> # Initialize aggregate_fn 2025-10-10T01:44:54.1471316Z >>> def agg_fn(x, y): 2025-10-10T01:44:54.1471759Z >>> return x + y 2025-10-10T01:44:54.1472185Z >>> 2025-10-10T01:44:54.1472568Z >>> # Initialize reduce_fn 2025-10-10T01:44:54.1473038Z >>> def reduce_fn(x): 2025-10-10T01:44:54.1473496Z >>> return torch.mean(x, dim=0) 2025-10-10T01:44:54.1474000Z >>> 2025-10-10T01:44:54.1474504Z >>> # Initialize mask_fn 2025-10-10T01:44:54.1474969Z >>> def mask_fn(data): 2025-10-10T01:44:54.1475510Z >>> return torch.eye(data.shape).to(data.device) 2025-10-10T01:44:54.1476078Z >>> 2025-10-10T01:44:54.1476411Z >>> 2025-10-10T01:44:54.1476800Z >>> act_sparsifier.register_layer( 2025-10-10T01:44:54.1477338Z ... model.some_layer, 2025-10-10T01:44:54.1477818Z ... aggregate_fn=agg_fn, 2025-10-10T01:44:54.1478308Z ... reduce_fn=reduce_fn, 2025-10-10T01:44:54.1478778Z ... mask_fn=mask_fn, 2025-10-10T01:44:54.1479223Z ... 
) 2025-10-10T01:44:54.1479564Z >>> 2025-10-10T01:44:54.1479942Z >>> # start training process 2025-10-10T01:44:54.1480419Z >>> for _ in [...]: 2025-10-10T01:44:54.1480847Z >>> # epoch starts 2025-10-10T01:44:54.1481399Z >>> # model.forward(), compute_loss() and model.backwards() 2025-10-10T01:44:54.1482033Z >>> # epoch ends 2025-10-10T01:44:54.1482472Z >>> act_sparsifier.step() 2025-10-10T01:44:54.1482979Z >>> # end training process 2025-10-10T01:44:54.1483490Z >>> sparsifier.squash_mask() 2025-10-10T01:44:54.1483820Z 2025-10-10T01:44:54.1484649Z Original Error: IndentationError("expected an indented block after 'for' statement on line 25", ('', 26, 1, '_._ = None\n', 26, 2)) 2025-10-10T01:44:54.1485659Z 2025-10-10T01:44:54.1485808Z _._ = None 2025-10-10T01:44:54.1486168Z ^ 2025-10-10T01:44:54.4061687Z msg = Cannot scrape callname=vmap in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py line=39. 2025-10-10T01:44:54.4063244Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:54.4063866Z 2025-10-10T01:44:54.4064294Z vmap is the vectorizing map; ``vmap(func)`` returns a new function that 2025-10-10T01:44:54.4065193Z maps ``func`` over some dimension of the inputs. Semantically, vmap 2025-10-10T01:44:54.4066080Z pushes the map into PyTorch operations called by ``func``, effectively 2025-10-10T01:44:54.4066828Z vectorizing those operations. 2025-10-10T01:44:54.4067168Z 2025-10-10T01:44:54.4067539Z vmap is useful for handling batch dimensions: one can write a function 2025-10-10T01:44:54.4068405Z ``func`` that runs on examples and then lift it to a function that can 2025-10-10T01:44:54.4069256Z take batches of examples with ``vmap(func)``. vmap can also be used to 2025-10-10T01:44:54.4070065Z compute batched gradients when composed with autograd. 2025-10-10T01:44:54.4070524Z 2025-10-10T01:44:54.4070706Z .. 
note:: 2025-10-10T01:44:54.4071803Z :func:`torch.vmap` is aliased to :func:`torch.func.vmap` for 2025-10-10T01:44:54.4072535Z convenience. Use whichever one you'd like. 2025-10-10T01:44:54.4072924Z 2025-10-10T01:44:54.4073080Z Args: 2025-10-10T01:44:54.4073633Z func (function): A Python function that takes one or more arguments. 2025-10-10T01:44:54.4074496Z Must return one or more Tensors. 2025-10-10T01:44:54.4075174Z in_dims (int or nested structure): Specifies which dimension of the 2025-10-10T01:44:54.4075971Z inputs should be mapped over. ``in_dims`` should have a 2025-10-10T01:44:54.4077109Z structure like the inputs. If the ``in_dim`` for a particular 2025-10-10T01:44:54.4077897Z input is None, then that indicates there is no map dimension. 2025-10-10T01:44:54.4078518Z Default: 0. 2025-10-10T01:44:54.4079081Z out_dims (int or Tuple[int]): Specifies where the mapped dimension 2025-10-10T01:44:54.4079880Z should appear in the outputs. If ``out_dims`` is a Tuple, then 2025-10-10T01:44:54.4080641Z it should have one element per output. Default: 0. 2025-10-10T01:44:54.4081400Z randomness (str): Specifies whether the randomness in this 2025-10-10T01:44:54.4082236Z vmap should be the same or different across batches. If 'different', 2025-10-10T01:44:54.4083098Z the randomness for each batch will be different. If 'same', the 2025-10-10T01:44:54.4083959Z randomness will be the same across batches. If 'error', any calls to 2025-10-10T01:44:54.4084851Z random functions will error. Default: 'error'. WARNING: this flag 2025-10-10T01:44:54.4085727Z only applies to random PyTorch operations and does not apply to 2025-10-10T01:44:54.4086465Z Python's random module or numpy randomness. 2025-10-10T01:44:54.4087268Z chunk_size (None or int): If None (default), apply a single vmap over inputs. 2025-10-10T01:44:54.4088184Z If not None, then compute the vmap :attr:`chunk_size` samples at a time. 
2025-10-10T01:44:54.4089172Z Note that :attr:`chunk_size=1` is equivalent to computing the vmap with a for-loop. 2025-10-10T01:44:54.4090262Z If you run into memory issues computing the vmap, please try a non-None chunk_size. 2025-10-10T01:44:54.4090892Z 2025-10-10T01:44:54.4091065Z Returns: 2025-10-10T01:44:54.4091591Z Returns a new "batched" function. It takes the same inputs as 2025-10-10T01:44:54.4092407Z ``func``, except each input has an extra dimension at the index 2025-10-10T01:44:54.4093236Z specified by ``in_dims``. It takes returns the same outputs as 2025-10-10T01:44:54.4094041Z ``func``, except each output has an extra dimension at the index 2025-10-10T01:44:54.4094715Z specified by ``out_dims``. 2025-10-10T01:44:54.4095024Z 2025-10-10T01:44:54.4095178Z .. warning: 2025-10-10T01:44:54.4095727Z :func:`vmap` works best with functional-style code. Please do not 2025-10-10T01:44:54.4096545Z perform any side-effects in ``func``, with the exception of 2025-10-10T01:44:54.4097430Z in-place PyTorch operations. Examples of side-effects include mutating 2025-10-10T01:44:54.4098372Z Python data structures and assigning values to variables not captured 2025-10-10T01:44:54.4099075Z in ``func``. 2025-10-10T01:44:54.4099326Z 2025-10-10T01:44:54.4099718Z One example of using :func:`vmap` is to compute batched dot products. PyTorch 2025-10-10T01:44:54.4100650Z doesn't provide a batched ``torch.dot`` API; instead of unsuccessfully 2025-10-10T01:44:54.4101554Z rummaging through docs, use :func:`vmap` to construct a new function. 
2025-10-10T01:44:54.4102104Z 2025-10-10T01:44:54.4102316Z >>> torch.dot # [D], [D] -> [] 2025-10-10T01:44:54.4120319Z >>> batched_dot = torch.func.vmap(torch.dot) # [N, D], [N, D] -> [N] 2025-10-10T01:44:54.4121266Z >>> x, y = torch.randn(2, 5), torch.randn(2, 5) 2025-10-10T01:44:54.4121882Z >>> batched_dot(x, y) 2025-10-10T01:44:54.4122196Z 2025-10-10T01:44:54.4122626Z :func:`vmap` can be helpful in hiding batch dimensions, leading to a simpler 2025-10-10T01:44:54.4123853Z model authoring experience. 2025-10-10T01:44:54.4124186Z 2025-10-10T01:44:54.4124398Z >>> batch_size, feature_size = 3, 5 2025-10-10T01:44:54.4125069Z >>> weights = torch.randn(feature_size, requires_grad=True) 2025-10-10T01:44:54.4125700Z >>> 2025-10-10T01:44:54.4126069Z >>> def model(feature_vec): 2025-10-10T01:44:54.4126601Z >>> # Very simple linear model with activation 2025-10-10T01:44:54.4127234Z >>> return feature_vec.dot(weights).relu() 2025-10-10T01:44:54.4127768Z >>> 2025-10-10T01:44:54.4128538Z >>> examples = torch.randn(batch_size, feature_size) 2025-10-10T01:44:54.4129194Z >>> result = torch.vmap(model)(examples) 2025-10-10T01:44:54.4129574Z 2025-10-10T01:44:54.4129989Z :func:`vmap` can also help vectorize computations that were previously difficult 2025-10-10T01:44:54.4130987Z or impossible to batch. One example is higher-order gradient computation. 2025-10-10T01:44:54.4131957Z The PyTorch autograd engine computes vjps (vector-Jacobian products). 2025-10-10T01:44:54.4132952Z Computing a full Jacobian matrix for some function f: R^N -> R^N usually 2025-10-10T01:44:54.4133941Z requires N calls to ``autograd.grad``, one per Jacobian row. Using :func:`vmap`, 2025-10-10T01:44:54.4134940Z we can vectorize the whole computation, computing the Jacobian in a single 2025-10-10T01:44:54.4135700Z call to ``autograd.grad``. 
2025-10-10T01:44:54.4136016Z 2025-10-10T01:44:54.4136169Z >>> # Setup 2025-10-10T01:44:54.4136552Z >>> N = 5 2025-10-10T01:44:54.4136947Z >>> f = lambda x: x**2 2025-10-10T01:44:54.4137452Z >>> x = torch.randn(N, requires_grad=True) 2025-10-10T01:44:54.4137991Z >>> y = f(x) 2025-10-10T01:44:54.4138397Z >>> I_N = torch.eye(N) 2025-10-10T01:44:54.4138822Z >>> 2025-10-10T01:44:54.4139205Z >>> # Sequential approach 2025-10-10T01:44:54.4139879Z >>> jacobian_rows = [torch.autograd.grad(y, x, v, retain_graph=True)[0] 2025-10-10T01:44:54.4140608Z >>> for v in I_N.unbind()] 2025-10-10T01:44:54.4141175Z >>> jacobian = torch.stack(jacobian_rows) 2025-10-10T01:44:54.4141693Z >>> 2025-10-10T01:44:54.4142091Z >>> # vectorized gradient computation 2025-10-10T01:44:54.4142620Z >>> def get_vjp(v): 2025-10-10T01:44:54.4143100Z >>> return torch.autograd.grad(y, x, v) 2025-10-10T01:44:54.4143676Z >>> jacobian = torch.vmap(get_vjp)(I_N) 2025-10-10T01:44:54.4144045Z 2025-10-10T01:44:54.4144494Z :func:`vmap` can also be nested, producing an output with multiple batched dimensions 2025-10-10T01:44:54.4145141Z 2025-10-10T01:44:54.4145321Z >>> torch.dot # [D], [D] -> [] 2025-10-10T01:44:54.4145840Z >>> batched_dot = torch.vmap( 2025-10-10T01:44:54.4146356Z ... torch.vmap(torch.dot) 2025-10-10T01:44:54.4146873Z ... 
) # [N1, N0, D], [N1, N0, D] -> [N1, N0] 2025-10-10T01:44:54.4147485Z >>> x, y = torch.randn(2, 3, 5), torch.randn(2, 3, 5) 2025-10-10T01:44:54.4148110Z >>> batched_dot(x, y) # tensor of size [2, 3] 2025-10-10T01:44:54.4148498Z 2025-10-10T01:44:54.4148917Z If the inputs are not batched along the first dimension, ``in_dims`` specifies 2025-10-10T01:44:54.4149766Z the dimension that each inputs are batched along as 2025-10-10T01:44:54.4150202Z 2025-10-10T01:44:54.4150392Z >>> torch.dot # [N], [N] -> [] 2025-10-10T01:44:54.4151082Z >>> batched_dot = torch.vmap(torch.dot, in_dims=1) # [N, D], [N, D] -> [D] 2025-10-10T01:44:54.4151843Z >>> x, y = torch.randn(2, 5), torch.randn(2, 5) 2025-10-10T01:44:54.4152387Z >>> batched_dot( 2025-10-10T01:44:54.4152799Z ... x, y 2025-10-10T01:44:54.4153359Z ... ) # output is [5] instead of [2] if batched along the 0th dimension 2025-10-10T01:44:54.4153861Z 2025-10-10T01:44:54.4154658Z If there are multiple inputs each of which is batched along different dimensions, 2025-10-10T01:44:54.4155638Z ``in_dims`` must be a tuple with the batch dimension for each input as 2025-10-10T01:44:54.4156154Z 2025-10-10T01:44:54.4156341Z >>> torch.dot # [D], [D] -> [] 2025-10-10T01:44:54.4157394Z >>> batched_dot = torch.vmap(torch.dot, in_dims=(0, None)) # [N, D], [D] -> [N] 2025-10-10T01:44:54.4158195Z >>> x, y = torch.randn(2, 5), torch.randn(5) 2025-10-10T01:44:54.4158735Z >>> batched_dot( 2025-10-10T01:44:54.4159135Z ... x, y 2025-10-10T01:44:54.4159677Z ... 
) # second arg doesn't have a batch dim because in_dim[1] was None 2025-10-10T01:44:54.4160188Z 2025-10-10T01:44:54.4160586Z If the input is a Python struct, ``in_dims`` must be a tuple containing a struct 2025-10-10T01:44:54.4161663Z matching the shape of the input: 2025-10-10T01:44:54.4161997Z 2025-10-10T01:44:54.4162248Z >>> f = lambda dict: torch.dot(dict["x"], dict["y"]) 2025-10-10T01:44:54.4162860Z >>> x, y = torch.randn(2, 5), torch.randn(5) 2025-10-10T01:44:54.4163410Z >>> input = {"x": x, "y": y} 2025-10-10T01:44:54.4164019Z >>> batched_dot = torch.vmap(f, in_dims=({"x": 0, "y": None},)) 2025-10-10T01:44:54.4164682Z >>> batched_dot(input) 2025-10-10T01:44:54.4164981Z 2025-10-10T01:44:54.4165451Z By default, the output is batched along the first dimension. However, it can be batched 2025-10-10T01:44:54.4166303Z along any dimension by using ``out_dims`` 2025-10-10T01:44:54.4166678Z 2025-10-10T01:44:54.4166852Z >>> f = lambda x: x**2 2025-10-10T01:44:54.4167300Z >>> x = torch.randn(2, 5) 2025-10-10T01:44:54.4167818Z >>> batched_pow = torch.vmap(f, out_dims=1) 2025-10-10T01:44:54.4168390Z >>> batched_pow(x) # [5, 2] 2025-10-10T01:44:54.4168699Z 2025-10-10T01:44:54.4169200Z For any function that uses kwargs, the returned function will not batch the kwargs but will 2025-10-10T01:44:54.4170033Z accept kwargs 2025-10-10T01:44:54.4170264Z 2025-10-10T01:44:54.4170446Z >>> x = torch.randn([2, 5]) 2025-10-10T01:44:54.4170924Z >>> def fn(x, scale=4.): 2025-10-10T01:44:54.4171393Z >>> return x * scale 2025-10-10T01:44:54.4171815Z >>> 2025-10-10T01:44:54.4172203Z >>> batched_pow = torch.vmap(fn) 2025-10-10T01:44:54.4172793Z >>> assert torch.allclose(batched_pow(x), x * 4) 2025-10-10T01:44:54.4173607Z >>> batched_pow(x, scale=x) # scale is not batched, output has shape [2, 2, 5] 2025-10-10T01:44:54.4174173Z 2025-10-10T01:44:54.4174348Z .. 
note:: 2025-10-10T01:44:54.4174925Z vmap does not provide general autobatching or handle variable-length 2025-10-10T01:44:54.4175658Z sequences out of the box. 2025-10-10T01:44:54.4175964Z 2025-10-10T01:44:54.4176849Z Original Error: IndentationError('expected an indented block after function definition on line 4', ('', 5, 1, '_._ = None\n', 5, 2)) 2025-10-10T01:44:54.4177877Z 2025-10-10T01:44:54.4178037Z _._ = None 2025-10-10T01:44:54.4178396Z ^ 2025-10-10T01:44:54.4179435Z msg = Cannot scrape callname=grad in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py line=306. 2025-10-10T01:44:54.4180797Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:54.4181796Z ``grad`` operator helps computing gradients of ``func`` with respect to the 2025-10-10T01:44:54.4182721Z input(s) specified by ``argnums``. This operator can be nested to 2025-10-10T01:44:54.4183438Z compute higher-order gradients. 2025-10-10T01:44:54.4183790Z 2025-10-10T01:44:54.4183949Z Args: 2025-10-10T01:44:54.4184521Z func (Callable): A Python function that takes one or more arguments. 2025-10-10T01:44:54.4185502Z Must return a single-element Tensor. If specified ``has_aux`` equals ``True``, 2025-10-10T01:44:54.4186572Z function can return a tuple of single-element Tensor and other auxiliary objects: 2025-10-10T01:44:54.4187393Z ``(output, aux)``. 2025-10-10T01:44:54.4188157Z argnums (int or Tuple[int]): Specifies arguments to compute gradients with respect to. 2025-10-10T01:44:54.4189140Z ``argnums`` can be single integer or tuple of integers. Default: 0. 2025-10-10T01:44:54.4190046Z has_aux (bool): Flag indicating that ``func`` returns a tensor and other 2025-10-10T01:44:54.4191158Z auxiliary objects: ``(output, aux)``. Default: False. 2025-10-10T01:44:54.4191625Z 2025-10-10T01:44:54.4191789Z Returns: 2025-10-10T01:44:54.4192472Z Function to compute gradients with respect to its inputs. 
By default, the output of 2025-10-10T01:44:54.4193516Z the function is the gradient tensor(s) with respect to the first argument. 2025-10-10T01:44:54.4194645Z If specified ``has_aux`` equals ``True``, tuple of gradients and output auxiliary objects 2025-10-10T01:44:54.4196004Z is returned. If ``argnums`` is a tuple of integers, a tuple of output gradients with 2025-10-10T01:44:54.4196851Z respect to each ``argnums`` value is returned. 2025-10-10T01:44:54.4197273Z 2025-10-10T01:44:54.4197468Z Example of using ``grad``: 2025-10-10T01:44:54.4197782Z 2025-10-10T01:44:54.4197972Z >>> # xdoctest: +SKIP 2025-10-10T01:44:54.4198483Z >>> from torch.func import grad 2025-10-10T01:44:54.4199021Z >>> x = torch.randn([]) 2025-10-10T01:44:54.4199563Z >>> cos_x = grad(lambda x: torch.sin(x))(x) 2025-10-10T01:44:54.4200178Z >>> assert torch.allclose(cos_x, x.cos()) 2025-10-10T01:44:54.4200720Z >>> 2025-10-10T01:44:54.4201107Z >>> # Second-order gradients 2025-10-10T01:44:54.4201692Z >>> neg_sin_x = grad(grad(lambda x: torch.sin(x)))(x) 2025-10-10T01:44:54.4202358Z >>> assert torch.allclose(neg_sin_x, -x.sin()) 2025-10-10T01:44:54.4202761Z 2025-10-10T01:44:54.4203193Z When composed with ``vmap``, ``grad`` can be used to compute per-sample-gradients: 2025-10-10T01:44:54.4203804Z 2025-10-10T01:44:54.4203985Z >>> # xdoctest: +SKIP 2025-10-10T01:44:54.4204509Z >>> from torch.func import grad, vmap 2025-10-10T01:44:54.4205083Z >>> batch_size, feature_size = 3, 5 2025-10-10T01:44:54.4205593Z >>> 2025-10-10T01:44:54.4206005Z >>> def model(weights, feature_vec): 2025-10-10T01:44:54.4206588Z >>> # Very simple linear model with activation 2025-10-10T01:44:54.4207199Z >>> assert feature_vec.dim() == 1 2025-10-10T01:44:54.4207775Z >>> return feature_vec.dot(weights).relu() 2025-10-10T01:44:54.4208302Z >>> 2025-10-10T01:44:54.4208735Z >>> def compute_loss(weights, example, target): 2025-10-10T01:44:54.4209346Z >>> y = model(weights, example) 2025-10-10T01:44:54.4209947Z >>> return ((y - target) 
** 2).mean() # MSELoss 2025-10-10T01:44:54.4210508Z >>> 2025-10-10T01:44:54.4211027Z >>> weights = torch.randn(feature_size, requires_grad=True) 2025-10-10T01:44:54.4211779Z >>> examples = torch.randn(batch_size, feature_size) 2025-10-10T01:44:54.4212420Z >>> targets = torch.randn(batch_size) 2025-10-10T01:44:54.4212997Z >>> inputs = (weights, examples, targets) 2025-10-10T01:44:54.4213784Z >>> grad_weight_per_example = vmap(grad(compute_loss), in_dims=(None, 0, 0))( 2025-10-10T01:44:54.4214536Z ... *inputs 2025-10-10T01:44:54.4214954Z ... ) 2025-10-10T01:44:54.4215191Z 2025-10-10T01:44:54.4215492Z Example of using ``grad`` with ``has_aux`` and ``argnums``: 2025-10-10T01:44:54.4215976Z 2025-10-10T01:44:54.4216146Z >>> # xdoctest: +SKIP 2025-10-10T01:44:54.4216640Z >>> from torch.func import grad 2025-10-10T01:44:54.4217195Z >>> def my_loss_func(y, y_pred): 2025-10-10T01:44:54.4217787Z >>> loss_per_sample = (0.5 * y_pred - y) ** 2 2025-10-10T01:44:54.4218385Z >>> loss = loss_per_sample.mean() 2025-10-10T01:44:54.4218979Z >>> return loss, (y_pred, loss_per_sample) 2025-10-10T01:44:54.4219514Z >>> 2025-10-10T01:44:54.4219972Z >>> fn = grad(my_loss_func, argnums=(0, 1), has_aux=True) 2025-10-10T01:44:54.4220587Z >>> y_true = torch.rand(4) 2025-10-10T01:44:54.4221139Z >>> y_preds = torch.rand(4, requires_grad=True) 2025-10-10T01:44:54.4221722Z >>> out = fn(y_true, y_preds) 2025-10-10T01:44:54.4222801Z >>> # > output is ((grads w.r.t y_true, grads w.r.t y_preds), (y_pred, loss_per_sample)) 2025-10-10T01:44:54.4223418Z 2025-10-10T01:44:54.4223574Z .. note:: 2025-10-10T01:44:54.4224091Z Using PyTorch ``torch.no_grad`` together with ``grad``. 
2025-10-10T01:44:54.4224573Z 2025-10-10T01:44:54.4224832Z Case 1: Using ``torch.no_grad`` inside a function: 2025-10-10T01:44:54.4225278Z 2025-10-10T01:44:54.4225455Z >>> # xdoctest: +SKIP 2025-10-10T01:44:54.4225950Z >>> def f(x): 2025-10-10T01:44:54.4226418Z >>> with torch.no_grad(): 2025-10-10T01:44:54.4227249Z >>> c = x ** 2 2025-10-10T01:44:54.4227744Z >>> return x - c 2025-10-10T01:44:54.4228055Z 2025-10-10T01:44:54.4228403Z In this case, ``grad(f)(x)`` will respect the inner ``torch.no_grad``. 2025-10-10T01:44:54.4228935Z 2025-10-10T01:44:54.4229249Z Case 2: Using ``grad`` inside ``torch.no_grad`` context manager: 2025-10-10T01:44:54.4229758Z 2025-10-10T01:44:54.4229935Z >>> # xdoctest: +SKIP 2025-10-10T01:44:54.4230447Z >>> with torch.no_grad(): 2025-10-10T01:44:54.4230957Z >>> grad(f)(x) 2025-10-10T01:44:54.4231258Z 2025-10-10T01:44:54.4231637Z In this case, ``grad`` will respect the inner ``torch.no_grad``, but not the 2025-10-10T01:44:54.4232584Z outer one. This is because ``grad`` is a "function transform": its result 2025-10-10T01:44:54.4233500Z should not depend on the result of a context manager outside of ``f``. 2025-10-10T01:44:54.4234067Z 2025-10-10T01:44:54.4234311Z 2025-10-10T01:44:54.4235365Z Original Error: IndentationError('expected an indented block after function definition on line 5', ('', 6, 1, '_._ = None\n', 6, 2)) 2025-10-10T01:44:54.4236384Z 2025-10-10T01:44:54.4236538Z _._ = None 2025-10-10T01:44:54.4236895Z ^ 2025-10-10T01:44:58.2119986Z msg = Cannot scrape callname=register_parametrization in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrize.py line=438. 2025-10-10T01:44:58.2121817Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:58.2122735Z Register a parametrization to a tensor in a module. 2025-10-10T01:44:58.2123210Z 2025-10-10T01:44:58.2123690Z Assume that ``tensor_name="weight"`` for simplicity. 
When accessing ``module.weight``, 2025-10-10T01:44:58.2124816Z the module will return the parametrized version ``parametrization(module.weight)``. 2025-10-10T01:44:58.2125924Z If the original tensor requires a gradient, the backward pass will differentiate 2025-10-10T01:44:58.2127088Z through :attr:`parametrization`, and the optimizer will update the tensor accordingly. 2025-10-10T01:44:58.2127756Z 2025-10-10T01:44:58.2128291Z The first time that a module registers a parametrization, this function will add an attribute 2025-10-10T01:44:58.2129429Z ``parametrizations`` to the module of type :class:`~ParametrizationList`. 2025-10-10T01:44:58.2130022Z 2025-10-10T01:44:58.2130469Z The list of parametrizations on the tensor ``weight`` will be accessible under 2025-10-10T01:44:58.2131314Z ``module.parametrizations.weight``. 2025-10-10T01:44:58.2131711Z 2025-10-10T01:44:58.2131958Z The original tensor will be accessible under 2025-10-10T01:44:58.2132635Z ``module.parametrizations.weight.original``. 2025-10-10T01:44:58.2133065Z 2025-10-10T01:44:58.2133518Z Parametrizations may be concatenated by registering several parametrizations 2025-10-10T01:44:58.2134331Z on the same attribute. 2025-10-10T01:44:58.2134639Z 2025-10-10T01:44:58.2135059Z The training mode of a registered parametrization is updated on registration 2025-10-10T01:44:58.2135878Z to match the training mode of the host module 2025-10-10T01:44:58.2136288Z 2025-10-10T01:44:58.2136814Z Parametrized parameters and buffers have an inbuilt caching system that can be activated 2025-10-10T01:44:58.2137729Z using the context manager :func:`cached`. 2025-10-10T01:44:58.2138110Z 2025-10-10T01:44:58.2139293Z A :attr:`parametrization` may optionally implement a method with signature 2025-10-10T01:44:58.2139950Z 2025-10-10T01:44:58.2140174Z .. 
code-block:: python 2025-10-10T01:44:58.2140480Z 2025-10-10T01:44:58.2140877Z def right_inverse(self, X: Tensor) -> Union[Tensor, Sequence[Tensor]] 2025-10-10T01:44:58.2141431Z 2025-10-10T01:44:58.2141896Z This method is called on the unparametrized tensor when the first parametrization 2025-10-10T01:44:58.2142893Z is registered to compute the initial value of the original tensor. 2025-10-10T01:44:58.2144299Z If this method is not implemented, the original tensor will be just the unparametrized tensor. 2025-10-10T01:44:58.2144992Z 2025-10-10T01:44:58.2145534Z If all the parametrizations registered on a tensor implement `right_inverse` it is possible 2025-10-10T01:44:58.2146713Z to initialize a parametrized tensor by assigning to it, as shown in the example below. 2025-10-10T01:44:58.2147385Z 2025-10-10T01:44:58.2147789Z It is possible for the first parametrization to depend on several inputs. 2025-10-10T01:44:58.2148755Z This may be implemented returning a tuple of tensors from ``right_inverse`` 2025-10-10T01:44:58.2149744Z (see the example implementation of a ``RankOne`` parametrization below). 2025-10-10T01:44:58.2150355Z 2025-10-10T01:44:58.2150918Z In this case, the unconstrained tensors are also located under ``module.parametrizations.weight`` 2025-10-10T01:44:58.2151892Z with names ``original0``, ``original1``,... 2025-10-10T01:44:58.2152314Z 2025-10-10T01:44:58.2152467Z .. note:: 2025-10-10T01:44:58.2152693Z 2025-10-10T01:44:58.2153145Z If unsafe=False (default) both the forward and right_inverse methods will be called 2025-10-10T01:44:58.2154046Z once to perform a number of consistency checks. 2025-10-10T01:44:58.2155072Z If unsafe=True, then right_inverse will be called if the tensor is not parametrized, 2025-10-10T01:44:58.2155915Z and nothing will be called otherwise. 2025-10-10T01:44:58.2156307Z 2025-10-10T01:44:58.2156458Z .. 
note:: 2025-10-10T01:44:58.2156679Z 2025-10-10T01:44:58.2157020Z In most situations, ``right_inverse`` will be a function such that 2025-10-10T01:44:58.2157754Z ``forward(right_inverse(X)) == X`` (see 2025-10-10T01:44:58.2158634Z `right inverse `_). 2025-10-10T01:44:58.2159717Z Sometimes, when the parametrization is not surjective, it may be reasonable 2025-10-10T01:44:58.2160500Z to relax this. 2025-10-10T01:44:58.2160771Z 2025-10-10T01:44:58.2160924Z .. warning:: 2025-10-10T01:44:58.2161168Z 2025-10-10T01:44:58.2161629Z If a parametrization depends on several inputs, :func:`~register_parametrization` 2025-10-10T01:44:58.2162693Z will register a number of new parameters. If such parametrization is registered 2025-10-10T01:44:58.2163764Z after the optimizer is created, these new parameters will need to be added manually 2025-10-10T01:44:58.2164744Z to the optimizer. See :meth:`torch.Optimizer.add_param_group`. 2025-10-10T01:44:58.2165262Z 2025-10-10T01:44:58.2165410Z Args: 2025-10-10T01:44:58.2165969Z module (nn.Module): module on which to register the parametrization 2025-10-10T01:44:58.2166872Z tensor_name (str): name of the parameter or buffer on which to register 2025-10-10T01:44:58.2167606Z the parametrization 2025-10-10T01:44:58.2168267Z parametrization (nn.Module): the parametrization to register 2025-10-10T01:44:58.2168953Z Keyword args: 2025-10-10T01:44:58.2169563Z unsafe (bool): a boolean flag that denotes whether the parametrization 2025-10-10T01:44:58.2170436Z may change the dtype and shape of the tensor. Default: `False` 2025-10-10T01:44:58.2171381Z Warning: the parametrization is not checked for consistency upon registration. 2025-10-10T01:44:58.2172209Z Enable this flag at your own risk. 
2025-10-10T01:44:58.2172932Z 2025-10-10T01:44:58.2173091Z Raises: 2025-10-10T01:44:58.2173790Z ValueError: if the module does not have a parameter or a buffer named :attr:`tensor_name` 2025-10-10T01:44:58.2174460Z 2025-10-10T01:44:58.2174615Z Examples: 2025-10-10T01:44:58.2175095Z >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_LAPACK) 2025-10-10T01:44:58.2175691Z >>> import torch 2025-10-10T01:44:58.2176163Z >>> import torch.nn as nn 2025-10-10T01:44:58.2176734Z >>> import torch.nn.utils.parametrize as P 2025-10-10T01:44:58.2179275Z >>> 2025-10-10T01:44:58.2179681Z >>> class Symmetric(nn.Module): 2025-10-10T01:44:58.2180227Z >>> def forward(self, X): 2025-10-10T01:44:58.2180869Z >>> return X.triu() + X.triu(1).T # Return a symmetric matrix 2025-10-10T01:44:58.2181504Z >>> 2025-10-10T01:44:58.2181896Z >>> def right_inverse(self, A): 2025-10-10T01:44:58.2182430Z >>> return A.triu() 2025-10-10T01:44:58.2182904Z >>> 2025-10-10T01:44:58.2183278Z >>> m = nn.Linear(5, 5) 2025-10-10T01:44:58.2183880Z >>> P.register_parametrization(m, "weight", Symmetric()) 2025-10-10T01:44:58.2184779Z >>> print(torch.allclose(m.weight, m.weight.T)) # m.weight is now symmetric 2025-10-10T01:44:58.2185518Z True 2025-10-10T01:44:58.2185904Z >>> A = torch.rand(5, 5) 2025-10-10T01:44:58.2186410Z >>> A = A + A.T # A is now symmetric 2025-10-10T01:44:58.2187109Z >>> m.weight = A # Initialize the weight to be the symmetric matrix A 2025-10-10T01:44:58.2187860Z >>> print(torch.allclose(m.weight, A)) 2025-10-10T01:44:58.2188380Z True 2025-10-10T01:44:58.2188595Z 2025-10-10T01:44:58.2188791Z >>> class RankOne(nn.Module): 2025-10-10T01:44:58.2189310Z >>> def forward(self, x, y): 2025-10-10T01:44:58.2189910Z >>> # Form a rank 1 matrix multiplying two vectors 2025-10-10T01:44:58.2190579Z >>> return x.unsqueeze(-1) @ y.unsqueeze(-2) 2025-10-10T01:44:58.2191131Z >>> 2025-10-10T01:44:58.2191524Z >>> def right_inverse(self, Z): 2025-10-10T01:44:58.2192096Z >>> # Project Z onto the rank 1 matrices 
2025-10-10T01:44:58.2192741Z >>> U, S, Vh = torch.linalg.svd(Z, full_matrices=False) 2025-10-10T01:44:58.2193392Z >>> # Return rescaled singular vectors 2025-10-10T01:44:58.2193972Z >>> s0_sqrt = S[0].sqrt().unsqueeze(-1) 2025-10-10T01:44:58.2194732Z >>> return U[..., :, 0] * s0_sqrt, Vh[..., 0, :] * s0_sqrt 2025-10-10T01:44:58.2195326Z >>> 2025-10-10T01:44:58.2195799Z >>> linear_rank_one = P.register_parametrization( 2025-10-10T01:44:58.2196449Z ... nn.Linear(4, 4), "weight", RankOne() 2025-10-10T01:44:58.2196991Z ... ) 2025-10-10T01:44:58.2197550Z >>> print(torch.linalg.matrix_rank(linear_rank_one.weight).item()) 2025-10-10T01:44:58.2198214Z 1 2025-10-10T01:44:58.2198428Z 2025-10-10T01:44:58.2198570Z 2025-10-10T01:44:58.2199653Z Original Error: IndentationError('expected an indented block after function definition on line 2', ('', 3, 0, '_._ = None\n', 3, -1)) 2025-10-10T01:44:58.2200698Z 2025-10-10T01:44:58.2200846Z _._ = None 2025-10-10T01:44:58.2201197Z ^ 2025-10-10T01:44:58.3425077Z msg = Cannot scrape callname=DeviceMesh.__getitem__ in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py line=726. 2025-10-10T01:44:58.3426780Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:58.3427429Z 2025-10-10T01:44:58.3427903Z Slice the current DeviceMesh based on the mesh_dim_names given to create a submesh. 2025-10-10T01:44:58.3428996Z The submesh created consists of the dimensions and the communicators indicated by 2025-10-10T01:44:58.3429793Z ``mesh_dim_names`` 2025-10-10T01:44:58.3430058Z 2025-10-10T01:44:58.3430207Z Args: 2025-10-10T01:44:58.3431289Z mesh_dim_names (Union[str, Tuple[str]]): the name or the tuple of names of the 2025-10-10T01:44:58.3432190Z mesh dimension of the DeviceMesh to create the submesh for. 
2025-10-10T01:44:58.3432827Z Returns: 2025-10-10T01:44:58.3433217Z A :class:`DeviceMesh` object 2025-10-10T01:44:58.3433549Z 2025-10-10T01:44:58.3434033Z The following program runs on each process/rank in an SPMD manner in a world size of 8. 2025-10-10T01:44:58.3435045Z In the first example: 2025-10-10T01:44:58.3435744Z Calling mesh_2d["tp"] on rank 0, 1, 2, 3 returns a 1D submesh of DeviceMesh:([0, 1, 2, 3]). 2025-10-10T01:44:58.3437086Z Calling mesh_2d["tp"] on rank 4, 5, 6, 7 returns a 1D submesh of DeviceMesh:([4, 5, 6, 7]). 2025-10-10T01:44:58.3438036Z Calling mesh_2d["dp"] on rank 0, 4 returns a 1D submesh of DeviceMesh:([0, 4]). 2025-10-10T01:44:58.3438940Z Calling mesh_2d["dp"] on rank 1, 5 returns a 1D submesh of DeviceMesh:([1, 5]). 2025-10-10T01:44:58.3439824Z Calling mesh_2d["dp"] on rank 2, 6 returns a 1D submesh of DeviceMesh:([2, 6]). 2025-10-10T01:44:58.3440737Z Calling mesh_2d["dp"] on rank 3, 7 returns a 1D submesh of DeviceMesh:([3, 7]). 2025-10-10T01:44:58.3441317Z 2025-10-10T01:44:58.3441492Z In the second example: 2025-10-10T01:44:58.3442212Z Calling mesh_3d["dp", "cp"] on rank 0, 1, 4, 5 returns a 2D submesh of DeviceMesh:([[0, 1], [4, 5]]). 2025-10-10T01:44:58.3443291Z Calling mesh_3d["dp", "cp"] on rank 2, 3, 6, 7 returns a 2D submesh of DeviceMesh:([[2, 3], [6, 7]]). 2025-10-10T01:44:58.3444318Z Calling mesh_3d["cp", "dp"] on rank 0, 1, 4, 5 returns a 2D submesh of DeviceMesh:([[0, 4], [1, 5]]). 2025-10-10T01:44:58.3445336Z Calling mesh_3d["cp", "dp"] on rank 2, 3, 6, 7 returns a 2D submesh of DeviceMesh:([[2, 6], [3, 7]]). 
2025-10-10T01:44:58.3445934Z 2025-10-10T01:44:58.3446106Z Example:: 2025-10-10T01:44:58.3446328Z 2025-10-10T01:44:58.3446512Z >>> # xdoctest: +SKIP("no rank") 2025-10-10T01:44:58.3447157Z >>> from torch.distributed.device_mesh import DeviceMesh 2025-10-10T01:44:58.3447771Z >>> 2025-10-10T01:44:58.3448317Z >>> # Initialize a 2D device mesh as (2, 4) to represent the topology 2025-10-10T01:44:58.3449066Z >>> # of cross-host(dim 0), and within-host (dim 1). 2025-10-10T01:44:58.3449896Z >>> mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp")) 2025-10-10T01:44:58.3450684Z >>> tp_mesh = mesh_2d["tp"] 2025-10-10T01:44:58.3451162Z >>> dp_mesh = mesh_2d["dp"] 2025-10-10T01:44:58.3451590Z >>> 2025-10-10T01:44:58.3451949Z >>> # Initialize a 3D mesh. 2025-10-10T01:44:58.3452715Z >>> mesh_3d = init_device_mesh(device_type="cuda", (2,2,2), mesh_dim_names=("dp", "pp", "cp")) 2025-10-10T01:44:58.3453873Z >>> # The order of the mesh_dim_names provided deteremines the order of dimensions in the submesh. 2025-10-10T01:44:58.3454756Z >>> dp_cp_mesh = mesh_3d["dp", "cp"] 2025-10-10T01:44:58.3455307Z >>> cp_dp_mesh = mesh_3d["cp", "dp"] 2025-10-10T01:44:58.3455670Z 2025-10-10T01:44:58.3456773Z Original Error: SyntaxError('positional argument follows keyword argument', ('', 6, 82, 'mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp"))\n', 6, 83)) 2025-10-10T01:44:58.3458027Z 2025-10-10T01:44:58.3458437Z mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp")) 2025-10-10T01:44:58.3459247Z ^ 2025-10-10T01:44:58.8076540Z msg = Cannot scrape callname=FullStateDictConfig in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py line=295. 
2025-10-10T01:44:58.8078309Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:58.8078941Z 2025-10-10T01:44:58.8079300Z ``FullStateDictConfig`` is a config class meant to be used with 2025-10-10T01:44:58.8080161Z ``StateDictType.FULL_STATE_DICT``. We recommend enabling both 2025-10-10T01:44:58.8081009Z ``offload_to_cpu=True`` and ``rank0_only=True`` when saving full state 2025-10-10T01:44:58.8082441Z dicts to save GPU memory and CPU memory, respectively. This config class 2025-10-10T01:44:58.8083363Z is meant to be used via the :func:`state_dict_type` context manager as 2025-10-10T01:44:58.8084033Z follows: 2025-10-10T01:44:58.8084254Z 2025-10-10T01:44:58.8084483Z >>> # xdoctest: +SKIP("undefined variables") 2025-10-10T01:44:58.8085295Z >>> from torch.distributed.fsdp import FullyShardedDataParallel as FSDP 2025-10-10T01:44:58.8086104Z >>> fsdp = FSDP(model, auto_wrap_policy=...) 2025-10-10T01:44:58.8087235Z >>> cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True) 2025-10-10T01:44:58.8088153Z >>> with FSDP.state_dict_type(fsdp, StateDictType.FULL_STATE_DICT, cfg): 2025-10-10T01:44:58.8088910Z >>> state = fsdp.state_dict() 2025-10-10T01:44:58.8089601Z >>> # `state` will be empty on non rank 0 and contain CPU tensors on rank 0. 2025-10-10T01:44:58.8090519Z >>> # To reload checkpoint for inference, finetuning, transfer learning, etc: 2025-10-10T01:44:58.8091490Z >>> model = model_fn() # Initialize model in preparation for wrapping with FSDP 2025-10-10T01:44:58.8092280Z >>> if dist.get_rank() == 0: 2025-10-10T01:44:58.8092917Z >>> # Load checkpoint only on rank 0 to avoid memory redundancy 2025-10-10T01:44:58.8093665Z >>> state_dict = torch.load("my_checkpoint.pt") 2025-10-10T01:44:58.8094300Z >>> model.load_state_dict(state_dict) 2025-10-10T01:44:58.8095063Z >>> # All ranks initialize FSDP module as usual. 
`sync_module_states` argument 2025-10-10T01:44:58.8096044Z >>> # communicates loaded checkpoint states from rank 0 to rest of the world. 2025-10-10T01:44:58.8096766Z >>> fsdp = FSDP( 2025-10-10T01:44:58.8097169Z ... model, 2025-10-10T01:44:58.8097627Z ... device_id=torch.cuda.current_device(), 2025-10-10T01:44:58.8098203Z ... auto_wrap_policy=..., 2025-10-10T01:44:58.8098702Z ... sync_module_states=True, 2025-10-10T01:44:58.8099182Z ... ) 2025-10-10T01:44:58.8099746Z >>> # After this point, all ranks have FSDP model with loaded checkpoint. 2025-10-10T01:44:58.8100293Z 2025-10-10T01:44:58.8100450Z Attributes: 2025-10-10T01:44:58.8101011Z rank0_only (bool): If ``True``, then only rank 0 saves the full state 2025-10-10T01:44:58.8101873Z dict, and nonzero ranks save an empty dict. If ``False``, then all 2025-10-10T01:44:58.8102660Z ranks save the full state dict. (Default: ``False``) 2025-10-10T01:44:58.8103104Z 2025-10-10T01:44:58.8103908Z Original Error: IndentationError("expected an indented block after 'if' statement on line 10", ('', 11, 1, '_._ = None\n', 11, 2)) 2025-10-10T01:44:58.8104886Z 2025-10-10T01:44:58.8105031Z _._ = None 2025-10-10T01:44:58.8105377Z ^ 2025-10-10T01:44:59.1331309Z msg = Cannot scrape callname=SavePlanner in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/planner.py line=122. 2025-10-10T01:44:59.1333209Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:59.1333832Z 2025-10-10T01:44:59.1334329Z Abstract class defining the protocol used by save_state_dict to plan the save process. 2025-10-10T01:44:59.1335000Z 2025-10-10T01:44:59.1335492Z SavePlanners are stateful objects that can be used to customize the whole save process. 2025-10-10T01:44:59.1336170Z 2025-10-10T01:44:59.1336659Z SavePlanner acts as an access proxy to the state_dict, so any transformation done to it 2025-10-10T01:44:59.1337528Z will be visible to the whole process. 
2025-10-10T01:44:59.1337909Z 2025-10-10T01:44:59.1338380Z A planner subclass can expect the following sequence of calls during save_state_dict: 2025-10-10T01:44:59.1339021Z 2025-10-10T01:44:59.1339239Z 1) set_up_planner - called on all ranks. 2025-10-10T01:44:59.1339844Z Signals the start of a checkpoint save. 2025-10-10T01:44:59.1340229Z 2025-10-10T01:44:59.1340447Z 2) create_local_plan - called on all ranks. 2025-10-10T01:44:59.1341799Z Process the state_dict and produces a `SavePlan` that will be sent for global planning. 2025-10-10T01:44:59.1342485Z 2025-10-10T01:44:59.1342813Z 3) create_global_plan - called on the coordinator rank only. 2025-10-10T01:44:59.1343649Z Takes the SavePlan from all ranks and make any global decision. 2025-10-10T01:44:59.1344184Z 2025-10-10T01:44:59.1344385Z 4) finish_plan - called on all ranks. 2025-10-10T01:44:59.1345115Z This gives each rank a chance to adjust to global planning decisions. 2025-10-10T01:44:59.1345672Z 2025-10-10T01:44:59.1346262Z 5) resolve_data - called multiple times on each rank 2025-10-10T01:44:59.1347053Z Lookups a value on the `state_dict` for the storage layer to write. 2025-10-10T01:44:59.1347610Z 2025-10-10T01:44:59.1348121Z Users are recommended to extend DefaultSavePlanner instead of this interface directly as 2025-10-10T01:44:59.1349178Z most changes can be expressed by changes in a single method. 2025-10-10T01:44:59.1349675Z 2025-10-10T01:44:59.1349890Z There are 3 usual patterns of extension: 2025-10-10T01:44:59.1350310Z 2025-10-10T01:44:59.1350739Z Rewriting state_dict. 
This is the simplest way to extend the save process as it 2025-10-10T01:44:59.1351723Z doesn't requite understanding the intrincacies of how SavePlan works: 2025-10-10T01:44:59.1352289Z 2025-10-10T01:44:59.1352488Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:44:59.1353080Z >>> class RenamePlanner(DefaultSavePlanner): 2025-10-10T01:44:59.1353648Z >>> def set_up_planner( 2025-10-10T01:44:59.1354288Z >>> self, 2025-10-10T01:44:59.1354729Z >>> state_dict: STATE_DICT_TYPE, 2025-10-10T01:44:59.1355292Z >>> storage_meta: Optional[StorageMeta], 2025-10-10T01:44:59.1355857Z >>> is_coordinator: bool, 2025-10-10T01:44:59.1356340Z >>> ) -> None: 2025-10-10T01:44:59.1356763Z >>> # prefix all keys with `foo_`` 2025-10-10T01:44:59.1357579Z >>> super().set_up_planner({"foo_" + k: v for k, v in state_dict.items()}, storage_meta, is_coordinator) 2025-10-10T01:44:59.1358244Z 2025-10-10T01:44:59.1358807Z Modifying local plan and lookup in tandem. This is useful when fine control of how data is persisted 2025-10-10T01:44:59.1359544Z 2025-10-10T01:44:59.1359748Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:44:59.1360322Z >>> class FP16Planner(DefaultSavePlanner): 2025-10-10T01:44:59.1360882Z >>> def create_local_plan(self): 2025-10-10T01:44:59.1361426Z >>> plan = super().create_local_plan() 2025-10-10T01:44:59.1361968Z >>> for p in plan: 2025-10-10T01:44:59.1362453Z >>> if p.tensor_data is not None: 2025-10-10T01:44:59.1363098Z >>> p.tensor_data.properties.dtype = torch.float16 2025-10-10T01:44:59.1363708Z >>> return plan 2025-10-10T01:44:59.1364109Z >>> 2025-10-10T01:44:59.1364494Z >>> def resolve_data(self, write_item): 2025-10-10T01:44:59.1365078Z >>> item = super().resolve_data(write_item) 2025-10-10T01:44:59.1365927Z >>> return item if write_item.type == WriteItemType.BYTE_IO else item.to(torch.float16) 2025-10-10T01:44:59.1366584Z 2025-10-10T01:44:59.1367148Z Using the global planning step to make central decisions that can't be made individually by each rank 
2025-10-10T01:44:59.1367906Z 2025-10-10T01:44:59.1368105Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:44:59.1368651Z >>> from itertools import zip_longest 2025-10-10T01:44:59.1369188Z >>> from dataclasses import replace 2025-10-10T01:44:59.1369818Z >>> class DDPLoadBalancingPlanner(DefaultSavePlanner): 2025-10-10T01:44:59.1370729Z >>> # This uses the default local plan behavior of having all non-sharded writes in rank 0 2025-10-10T01:44:59.1371599Z >>> # This sample doesn't handle ShardedTensors 2025-10-10T01:44:59.1372207Z >>> def create_global_plan(self, all_plans): 2025-10-10T01:44:59.1372857Z >>> iters = [iter(all_plans[0].items)] * len(all_plans) 2025-10-10T01:44:59.1373462Z >>> items_per_rank = [ 2025-10-10T01:44:59.1373992Z >>> [item for item in items if item is not None] 2025-10-10T01:44:59.1375014Z >>> for items in zip(*zip_longest(*iters), strict=True) 2025-10-10T01:44:59.1375624Z >>> ] 2025-10-10T01:44:59.1376010Z >>> all_plans = [ 2025-10-10T01:44:59.1376503Z >>> replace(plan, items=items) 2025-10-10T01:44:59.1377191Z >>> for plan, items in zip(all_plans, items_per_rank, strict=True) 2025-10-10T01:44:59.1377855Z >>> ] 2025-10-10T01:44:59.1378321Z >>> return super().create_global_plan(all_plans) 2025-10-10T01:44:59.1378755Z 2025-10-10T01:44:59.1379510Z Finally, some planners need to save additional metadata in the checkpoint, this is 2025-10-10T01:44:59.1380581Z accomplished by having each rank contribute their data items in the local plan and 2025-10-10T01:44:59.1381406Z the global planner aggregate them: 2025-10-10T01:44:59.1381769Z 2025-10-10T01:44:59.1381967Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:44:59.1382604Z >>> class SaveExtraDataPlanner(DefaultSavePlanner): 2025-10-10T01:44:59.1383267Z >>> def create_local_plan(self) -> SavePlan: 2025-10-10T01:44:59.1383877Z >>> plan = super().create_local_plan() 2025-10-10T01:44:59.1384527Z >>> return replace(plan, planner_data="per-rank-data") 2025-10-10T01:44:59.1385131Z >>> 
2025-10-10T01:44:59.1385822Z >>> def create_global_plan(self, all_plans: List[SavePlan]) -> Tuple[List[SavePlan], Metadata]: 2025-10-10T01:44:59.1386845Z >>> global_plan, metadata = super().create_global_plan(all_plans) 2025-10-10T01:44:59.1387635Z >>> merged_data = [p.planner_data for p in global_plan] 2025-10-10T01:44:59.1388385Z >>> metadata = replace(metadata, planner_data=merged_data) 2025-10-10T01:44:59.1389039Z >>> return global_plan, metadata 2025-10-10T01:44:59.1389413Z 2025-10-10T01:44:59.1390285Z Original Error: IndentationError('expected an indented block after function definition on line 3', ('', 9, 0, '_._ = None\n', 9, -1)) 2025-10-10T01:44:59.1391333Z 2025-10-10T01:44:59.1391482Z _._ = None 2025-10-10T01:44:59.1391829Z ^ 2025-10-10T01:44:59.1393078Z msg = Cannot scrape callname=LoadPlanner in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/planner.py line=305. 2025-10-10T01:44:59.1394783Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:44:59.1395394Z 2025-10-10T01:44:59.1395874Z Abstract class defining the protocol used by load_state_dict to plan the load process. 2025-10-10T01:44:59.1396544Z 2025-10-10T01:44:59.1397021Z LoadPlanner are stateful objects that can be used to customize the whole load process. 2025-10-10T01:44:59.1397682Z 2025-10-10T01:44:59.1398152Z LoadPlanner acts as an access proxy to the state_dict, so any transformation done to it 2025-10-10T01:44:59.1399004Z will be visible to the whole process. 2025-10-10T01:44:59.1399363Z 2025-10-10T01:44:59.1399827Z A planner subclass can expect the following sequence of calls during load_state_dict: 2025-10-10T01:44:59.1400468Z 2025-10-10T01:44:59.1400679Z 1) set_up_planner - called on all ranks. 2025-10-10T01:44:59.1401287Z Signals the start of loading a checkpoint. 2025-10-10T01:44:59.1401684Z 2025-10-10T01:44:59.1401907Z 2) create_local_plan - called on all ranks. 
2025-10-10T01:44:59.1402769Z Process the state_dict and produces a `LoadPlan` that will be sent for global planning. 2025-10-10T01:44:59.1403429Z 2025-10-10T01:44:59.1403746Z 3) create_global_plan - called on the coordinator rank only. 2025-10-10T01:44:59.1404573Z Takes the LoadPlan from all ranks and make any global decision. 2025-10-10T01:44:59.1405091Z 2025-10-10T01:44:59.1405348Z 4) load_bytes - called multiple times on each rank 2025-10-10T01:44:59.1406062Z This is called once per non-tensor value in state_dict. 2025-10-10T01:44:59.1406518Z 2025-10-10T01:44:59.1406901Z 5) resolve_tensor and commit_tensor - called multiple times on each rank 2025-10-10T01:44:59.1407763Z They are called in pair for each Tensor value in state_dict. 2025-10-10T01:44:59.1408243Z 2025-10-10T01:44:59.1409078Z Users are recommended to extend DefaultLoadPlanner instead of this interface directly as 2025-10-10T01:44:59.1410088Z most changes can be expressed by changes in a single method. 2025-10-10T01:44:59.1410580Z 2025-10-10T01:44:59.1410802Z There are two usual patterns of extension: 2025-10-10T01:44:59.1411194Z 2025-10-10T01:44:59.1411620Z Rewriting state_dict. This is the simplest way to extend the load process as it 2025-10-10T01:44:59.1412670Z doesn't requite understanding the intrincacies of how LoadPlan works. 
We need 2025-10-10T01:44:59.1413954Z to keep a reference to the original state_dict as load happens in place so 2025-10-10T01:44:59.1414745Z we need to be able to perform it in place 2025-10-10T01:44:59.1415135Z 2025-10-10T01:44:59.1415333Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:44:59.1415922Z >>> class RenamePlanner(DefaultLoadPlanner): 2025-10-10T01:44:59.1416498Z >>> def set_up_planner( 2025-10-10T01:44:59.1416948Z >>> self, 2025-10-10T01:44:59.1417376Z >>> state_dict: STATE_DICT_TYPE, 2025-10-10T01:44:59.1417922Z >>> metadata: Metadata, 2025-10-10T01:44:59.1418417Z >>> is_coordinator: bool, 2025-10-10T01:44:59.1418893Z >>> ) -> None: 2025-10-10T01:44:59.1419344Z >>> self.original_state_dict = state_dict 2025-10-10T01:44:59.1420032Z >>> state_dict = {"foo_" + k: v for k, v in state_dict.items()} 2025-10-10T01:44:59.1420651Z >>> 2025-10-10T01:44:59.1421050Z >>> if self.flatten_sharded_tensors: 2025-10-10T01:44:59.1421671Z >>> state_dict = _flatten_sharded_tensors(state_dict) 2025-10-10T01:44:59.1422268Z >>> 2025-10-10T01:44:59.1422640Z >>> if self.flatten_state_dict: 2025-10-10T01:44:59.1423304Z >>> state_dict, self.mappings = flatten_state_dict(state_dict) 2025-10-10T01:44:59.1423936Z >>> 2025-10-10T01:44:59.1424305Z >>> self.state_dict = state_dict 2025-10-10T01:44:59.1424838Z >>> self.metadata = metadata 2025-10-10T01:44:59.1425383Z >>> self.is_coordinator = is_coordinator 2025-10-10T01:44:59.1425924Z >>> 2025-10-10T01:44:59.1426322Z >>> def load_bytes(self, read_item, value): 2025-10-10T01:44:59.1426887Z >>> # Remove the "foo_" prefix 2025-10-10T01:44:59.1427728Z >>> self.original_state_dict[read_item.dest_index.fqn[4:]] = torch.load(value, weights_only=False) 2025-10-10T01:44:59.1428464Z 2025-10-10T01:44:59.1428472Z 2025-10-10T01:44:59.1428900Z Modifying resolve_tensor and commit_tensor to handle load time transformation. 
2025-10-10T01:44:59.1429530Z 2025-10-10T01:44:59.1429739Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:44:59.1430354Z >>> class MetaModelMaterialize(DefaultSavePlanner): 2025-10-10T01:44:59.1431012Z >>> def resolve_tensor(self, read_item): 2025-10-10T01:44:59.1431604Z >>> tensor = super().resolve_tensor(read_item) 2025-10-10T01:44:59.1432268Z >>> return torch.empty_like(tensor, device="cpu") 2025-10-10T01:44:59.1432838Z >>> 2025-10-10T01:44:59.1433249Z >>> def commit_tensor(self, read_item, tensor): 2025-10-10T01:44:59.1433916Z >>> self.state_dict[read_item.dest_index.fqn] = tensor 2025-10-10T01:44:59.1434673Z 2025-10-10T01:44:59.1435546Z Original Error: IndentationError('expected an indented block after function definition on line 22', ('', 23, 0, '_._ = None\n', 23, -1)) 2025-10-10T01:44:59.1436596Z 2025-10-10T01:44:59.1436744Z _._ = None 2025-10-10T01:44:59.1437102Z ^ 2025-10-10T01:44:59.3874968Z running 877 test(s) 2025-10-10T01:44:59.3880223Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::typename:0, line 1095 <- wrt source file 2025-10-10T01:44:59.3889069Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::typename:0 2025-10-10T01:44:59.3890994Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::is_tensor:0, line 1131 <- wrt source file 2025-10-10T01:44:59.3894508Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::is_tensor:0 2025-10-10T01:44:59.3896506Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::is_storage:0, line 1146 <- wrt source file 2025-10-10T01:44:59.3898391Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::is_storage:0 2025-10-10T01:44:59.3900254Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_device:0, line 1224 <- wrt source file 2025-10-10T01:44:59.3902883Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_device:0 2025-10-10T01:44:59.3904828Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_tensor_type:0, line 1273 <- wrt source file 2025-10-10T01:44:59.3906812Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_tensor_type:0 2025-10-10T01:44:59.3908743Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_dtype:0, line 1310 <- wrt source file 2025-10-10T01:44:59.3910688Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_dtype:0 2025-10-10T01:44:59.3912622Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::use_deterministic_algorithms:0, line 1454 <- wrt source file 2025-10-10T01:44:59.3914884Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::use_deterministic_algorithms:0 2025-10-10T01:44:59.3916753Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::compile:0, line 2585 <- wrt source file 2025-10-10T01:44:59.3918515Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::compile:0 2025-10-10T01:44:59.3920451Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::_is_device_backend_autoload_enabled:0, line 2878 <- wrt source file 2025-10-10T01:44:59.3922583Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::_is_device_backend_autoload_enabled:0 2025-10-10T01:44:59.3924587Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.register_hook:0, line 679 <- wrt source file 2025-10-10T01:44:59.3949394Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.register_hook:0 2025-10-10T01:44:59.3951644Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.register_post_accumulate_grad_hook:0, line 736 <- wrt source file 2025-10-10T01:44:59.3971245Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.register_post_accumulate_grad_hook:0 2025-10-10T01:44:59.3973509Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.refine_names:0, line 1375 <- wrt source file 2025-10-10T01:44:59.4036101Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.refine_names:0 2025-10-10T01:44:59.4038153Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.align_to:0, line 1420 <- wrt source file 2025-10-10T01:44:59.4040219Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.align_to:0 2025-10-10T01:44:59.4042056Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.rename:0, line 1493 <- wrt source file 2025-10-10T01:44:59.4047758Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.rename:0 2025-10-10T01:44:59.4050372Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.to_sparse_coo:0, line 1523 <- wrt source file 2025-10-10T01:44:59.4052960Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.to_sparse_coo:0 2025-10-10T01:44:59.4054925Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.dim_order:0, line 1555 <- wrt source file 2025-10-10T01:44:59.4071093Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.dim_order:0 2025-10-10T01:44:59.4073859Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/quasirandom.py::SobolEngine:0, line 39 <- wrt source file 2025-10-10T01:44:59.4076002Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/quasirandom.py::SobolEngine:0 2025-10-10T01:44:59.4077957Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::broadcast_tensors:0, line 64 <- wrt source file 2025-10-10T01:44:59.4079949Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::broadcast_tensors:0 2025-10-10T01:44:59.4081926Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::broadcast_shapes:0, line 92 <- wrt source file 2025-10-10T01:44:59.4083928Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::broadcast_shapes:0 2025-10-10T01:44:59.4085777Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::split:0, line 144 <- wrt source file 2025-10-10T01:44:59.4097000Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::split:0 2025-10-10T01:44:59.4098931Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::einsum:0, line 258 <- wrt source file 2025-10-10T01:44:59.4115189Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::einsum:0 2025-10-10T01:44:59.4117165Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::meshgrid:0, line 450 <- wrt source file 2025-10-10T01:44:59.4155869Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::meshgrid:0 2025-10-10T01:44:59.4157884Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_unique_impl:0, line 835 <- wrt source file 2025-10-10T01:44:59.4198064Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_unique_impl:0 2025-10-10T01:44:59.4200216Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_unique_consecutive_impl:0, line 992 <- wrt source file 2025-10-10T01:44:59.4210987Z 
* SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_unique_consecutive_impl:0 2025-10-10T01:44:59.4213078Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::tensordot:0, line 1267 <- wrt source file 2025-10-10T01:44:59.4221724Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::tensordot:0 2025-10-10T01:44:59.4223660Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::cartesian_prod:0, line 1351 <- wrt source file 2025-10-10T01:44:59.4228434Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::cartesian_prod:0 2025-10-10T01:44:59.4230330Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::block_diag:0, line 1385 <- wrt source file 2025-10-10T01:44:59.4239237Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::block_diag:0 2025-10-10T01:44:59.4241719Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::cdist:0, line 1441 <- wrt source file 2025-10-10T01:44:59.4251548Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::cdist:0 2025-10-10T01:44:59.4253483Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_1d:0, line 1482 <- wrt source file 2025-10-10T01:44:59.4267251Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_1d:0 2025-10-10T01:44:59.4269163Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_2d:0, line 1520 <- wrt source file 2025-10-10T01:44:59.4284401Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_2d:0 2025-10-10T01:44:59.4286292Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_3d:0, line 1560 <- wrt source file 
2025-10-10T01:44:59.4304788Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_3d:0 2025-10-10T01:44:59.4306590Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::norm:0, line 1735 <- wrt source file 2025-10-10T01:44:59.4336479Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::norm:0 2025-10-10T01:44:59.4338413Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::unravel_index:0, line 1905 <- wrt source file 2025-10-10T01:44:59.4365568Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::unravel_index:0 2025-10-10T01:44:59.4367464Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::chain_matmul:0, line 2005 <- wrt source file 2025-10-10T01:44:59.4369363Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::chain_matmul:0 2025-10-10T01:44:59.4371171Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_lu_impl:0, line 2106 <- wrt source file 2025-10-10T01:44:59.4373010Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_lu_impl:0 2025-10-10T01:44:59.4374964Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so::Generator:0, line 15 <- wrt source file 2025-10-10T01:44:59.4377133Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so::Generator:0 2025-10-10T01:44:59.4379241Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so::_LinAlgError:0, line 5 <- wrt source file 2025-10-10T01:44:59.4381409Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so::_LinAlgError:0 2025-10-10T01:44:59.4383482Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_namedtensor_internals.py::update_names:0, line 118 <- wrt source file 2025-10-10T01:44:59.4385536Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_namedtensor_internals.py::update_names:0 2025-10-10T01:44:59.4387482Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor_str.py::set_printoptions:0, line 53 <- wrt source file 2025-10-10T01:44:59.4389429Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor_str.py::set_printoptions:0 2025-10-10T01:44:59.4391322Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/torch_version.py::TorchVersion:0, line 19 <- wrt source file 2025-10-10T01:44:59.4393670Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/torch_version.py::TorchVersion:0 2025-10-10T01:44:59.4395767Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::add_safe_globals:0, line 300 <- wrt source file 2025-10-10T01:44:59.4397736Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::add_safe_globals:0 2025-10-10T01:44:59.4399650Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::safe_globals:0, line 325 <- wrt source file 2025-10-10T01:44:59.4401980Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::safe_globals:0 2025-10-10T01:44:59.4403821Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::skip_data:0, line 401 <- wrt source file 2025-10-10T01:44:59.4405714Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::skip_data:0 2025-10-10T01:44:59.4407653Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::register_package:0, line 473 <- wrt source file 2025-10-10T01:44:59.4409649Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::register_package:0 2025-10-10T01:44:59.4411486Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::save:0, line 960 <- wrt source file 2025-10-10T01:44:59.4413310Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::save:0 2025-10-10T01:44:59.4415077Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::load:0, line 1373 <- wrt source file 2025-10-10T01:44:59.4416906Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::load:0 2025-10-10T01:44:59.4418571Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::list:0, line 473 <- wrt source file 2025-10-10T01:44:59.4420254Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::list:0 2025-10-10T01:44:59.4421923Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::help:0, line 533 <- wrt source file 2025-10-10T01:44:59.4423545Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::help:0 2025-10-10T01:44:59.4425132Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::load:0, line 624 <- wrt source file 2025-10-10T01:44:59.4426723Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::load:0 2025-10-10T01:44:59.4428355Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::_load_local:0, line 672 <- wrt source file 2025-10-10T01:44:59.4430057Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::_load_local:0 2025-10-10T01:44:59.4431824Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::download_url_to_file:0, line 707 <- wrt source file 2025-10-10T01:44:59.4433677Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::download_url_to_file:0 
2025-10-10T01:44:59.4435886Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::load_state_dict_from_url:0, line 847 <- wrt source file 2025-10-10T01:44:59.4437803Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::load_state_dict_from_url:0 2025-10-10T01:44:59.4439741Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::get_ignored_functions:0, line 116 <- wrt source file 2025-10-10T01:44:59.4442067Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::get_ignored_functions:0 2025-10-10T01:44:59.4444049Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::get_testing_overrides:0, line 424 <- wrt source file 2025-10-10T01:44:59.4450189Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::get_testing_overrides:0 2025-10-10T01:44:59.4452182Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::wrap_torch_function:0, line 1581 <- wrt source file 2025-10-10T01:44:59.4454548Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::wrap_torch_function:0 2025-10-10T01:44:59.4456480Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::handle_torch_function:0, line 1716 <- wrt source file 2025-10-10T01:44:59.4458479Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::handle_torch_function:0 2025-10-10T01:44:59.4460466Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::is_tensor_method_or_property:0, line 1964 <- wrt source file 2025-10-10T01:44:59.4487337Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::is_tensor_method_or_property:0 2025-10-10T01:44:59.4489452Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::is_tensor_like:0, line 1983 <- wrt source file 
2025-10-10T01:44:59.4495455Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::is_tensor_like:0 2025-10-10T01:44:59.4497378Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library.define:0, line 144 <- wrt source file 2025-10-10T01:44:59.4499754Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library.define:0 2025-10-10T01:44:59.4501711Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library._impl_with_aoti_compile:0, line 238 <- wrt source file 2025-10-10T01:44:59.4510152Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library._impl_with_aoti_compile:0 2025-10-10T01:44:59.4512164Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library.impl:0, line 299 <- wrt source file 2025-10-10T01:44:59.4514727Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library.impl:0 2025-10-10T01:44:59.4516467Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::define:0, line 504 <- wrt source file 2025-10-10T01:44:59.5737505Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::define:0 2025-10-10T01:44:59.5739430Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::impl:0, line 610 <- wrt source file 2025-10-10T01:44:59.5753102Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::impl:0 2025-10-10T01:44:59.5755389Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_kernel:0, line 792 <- wrt source file 2025-10-10T01:44:59.5757392Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_kernel:0 2025-10-10T01:44:59.5759257Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_autocast:0, line 861 <- wrt source 
file 2025-10-10T01:44:59.5761164Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_autocast:0 2025-10-10T01:44:59.5763561Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_autograd:0, line 1110 <- wrt source file 2025-10-10T01:44:59.6241517Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_autograd:0 2025-10-10T01:44:59.6243689Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_torch_dispatch:0, line 1226 <- wrt source file 2025-10-10T01:44:59.6306581Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_torch_dispatch:0 2025-10-10T01:44:59.6309342Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_vmap:0, line 1315 <- wrt source file 2025-10-10T01:44:59.6445650Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_vmap:0 2025-10-10T01:44:59.6447721Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::opcheck:0, line 1640 <- wrt source file 2025-10-10T01:44:59.6449555Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::opcheck:0 2025-10-10T01:44:59.6451375Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::custom_op:0, line 55 <- wrt source file 2025-10-10T01:44:59.6453231Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::custom_op:0 2025-10-10T01:44:59.6455042Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::impl:0, line 138 <- wrt source file 2025-10-10T01:44:59.6456833Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::impl:0 2025-10-10T01:44:59.6458614Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::impl_abstract:0, line 208 <- wrt 
source file 2025-10-10T01:44:59.6539974Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::impl_abstract:0 2025-10-10T01:44:59.6542055Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py::_compile_kernel:0, line 1773 <- wrt source file 2025-10-10T01:44:59.6544067Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py::_compile_kernel:0 2025-10-10T01:44:59.6545972Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/__init__.py::annotate:0, line 147 <- wrt source file 2025-10-10T01:44:59.6547901Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/__init__.py::annotate:0 2025-10-10T01:44:59.6549772Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/mps/__init__.py::compile_shader:0, line 148 <- wrt source file 2025-10-10T01:44:59.6551697Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/mps/__init__.py::compile_shader:0 2025-10-10T01:44:59.6553698Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/monitor/__init__.py::TensorboardEventHandler:0, line 22 <- wrt source file 2025-10-10T01:44:59.6556530Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/monitor/__init__.py::TensorboardEventHandler:0 2025-10-10T01:44:59.6558606Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::allow_in_graph:0, line 128 <- wrt source file 2025-10-10T01:44:59.6560648Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::allow_in_graph:0 2025-10-10T01:44:59.6562654Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::substitute_in_graph:0, line 184 <- wrt source file 2025-10-10T01:44:59.8401837Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::substitute_in_graph:0 2025-10-10T01:44:59.8404096Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::wrap_numpy:0, line 414 <- wrt source file 2025-10-10T01:44:59.8406142Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::wrap_numpy:0 2025-10-10T01:44:59.8408064Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_compiling:0, line 446 <- wrt source file 2025-10-10T01:44:59.8410595Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_compiling:0 2025-10-10T01:44:59.8412625Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_dynamo_compiling:0, line 467 <- wrt source file 2025-10-10T01:44:59.8414723Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_dynamo_compiling:0 2025-10-10T01:44:59.8416685Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_exporting:0, line 485 <- wrt source file 2025-10-10T01:44:59.8418670Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_exporting:0 2025-10-10T01:44:59.8420661Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::save_cache_artifacts:0, line 500 <- wrt source file 2025-10-10T01:44:59.8422771Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::save_cache_artifacts:0 2025-10-10T01:44:59.8424819Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::load_cache_artifacts:0, line 515 <- wrt source file 2025-10-10T01:44:59.8426900Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::load_cache_artifacts:0 2025-10-10T01:44:59.8428794Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::save:0, line 349 <- wrt source file 
2025-10-10T01:44:59.8430774Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::save:0 2025-10-10T01:44:59.8432749Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::load:0, line 419 <- wrt source file 2025-10-10T01:44:59.8434808Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::load:0 2025-10-10T01:44:59.8436699Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::register_dataclass:0, line 579 <- wrt source file 2025-10-10T01:44:59.8438799Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::register_dataclass:0 2025-10-10T01:44:59.8440789Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::as_nested_tensor:0, line 61 <- wrt source file 2025-10-10T01:44:59.8507172Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::as_nested_tensor:0 2025-10-10T01:44:59.8509225Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::nested_tensor:0, line 240 <- wrt source file 2025-10-10T01:44:59.8515484Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::nested_tensor:0 2025-10-10T01:44:59.8517428Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::narrow:0, line 315 <- wrt source file 2025-10-10T01:44:59.8566130Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::narrow:0 2025-10-10T01:44:59.8568615Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::nested_tensor_from_jagged:0, line 405 <- wrt source file 2025-10-10T01:44:59.8574418Z W1010 01:44:59.856000 10156 site-packages/torch/fx/_symbolic_trace.py:52] is_fx_tracing will return true for both fx.symbolic_trace and torch.export. 
Please use is_fx_tracing_symbolic_tracing() for specifically fx.symbolic_trace or torch.compiler.is_compiling() for specifically torch.export/compile. 2025-10-10T01:44:59.8591900Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::nested_tensor_from_jagged:0 2025-10-10T01:44:59.8593965Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::masked_select:0, line 481 <- wrt source file 2025-10-10T01:44:59.8612872Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::masked_select:0 2025-10-10T01:44:59.8614733Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::current_accelerator:0, line 113 <- wrt source file 2025-10-10T01:45:00.1349244Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::current_accelerator:0 2025-10-10T01:45:00.1351670Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::device_index:0, line 249 <- wrt source file 2025-10-10T01:45:00.1353920Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::device_index:0 2025-10-10T01:45:00.1356184Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.then:0, line 152 <- wrt source file 2025-10-10T01:45:00.1358300Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.then:0 2025-10-10T01:45:00.1360446Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.add_done_callback:0, line 201 <- wrt source file 2025-10-10T01:45:00.1362731Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.add_done_callback:0 2025-10-10T01:45:00.1364882Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.set_result:0, line 
235 <- wrt source file 2025-10-10T01:45:00.1367118Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.set_result:0 2025-10-10T01:45:00.1369270Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.set_exception:0, line 265 <- wrt source file 2025-10-10T01:45:00.1371485Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.set_exception:0 2025-10-10T01:45:00.1373607Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::collect_all:0, line 299 <- wrt source file 2025-10-10T01:45:00.1375690Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::collect_all:0 2025-10-10T01:45:00.1377741Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/__init__.py::list_mode_options:0, line 331 <- wrt source file 2025-10-10T01:45:00.1379900Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/__init__.py::list_mode_options:0 2025-10-10T01:45:00.1381976Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/__init__.py::list_options:0, line 368 <- wrt source file 2025-10-10T01:45:00.1384050Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/__init__.py::list_options:0 2025-10-10T01:45:00.1386810Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_prims_common/__init__.py::compute_required_storage_length:0, line 1913 <- wrt source file 2025-10-10T01:45:00.1389385Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_prims_common/__init__.py::compute_required_storage_length:0 2025-10-10T01:45:00.1391522Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::sum:0, line 223 <- wrt source file 2025-10-10T01:45:00.1393787Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::sum:0 2025-10-10T01:45:00.1396029Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::check_sparse_tensor_invariants:0, line 475 <- wrt source file 2025-10-10T01:45:00.1400560Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::check_sparse_tensor_invariants:0 2025-10-10T01:45:00.1402871Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::as_sparse_gradcheck:0, line 561 <- wrt source file 2025-10-10T01:45:00.1457200Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::as_sparse_gradcheck:0 2025-10-10T01:45:00.1459044Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::vjp:0, line 293 <- wrt source file 2025-10-10T01:45:00.1460819Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::vjp:0 2025-10-10T01:45:00.1462510Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::jvp:0, line 395 <- wrt source file 2025-10-10T01:45:00.1464242Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::jvp:0 2025-10-10T01:45:00.1465936Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::jacobian:0, line 630 <- wrt source file 2025-10-10T01:45:00.1468307Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::jacobian:0 2025-10-10T01:45:00.1470011Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::hessian:0, line 894 <- wrt source file 2025-10-10T01:45:00.1471765Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::hessian:0 2025-10-10T01:45:00.1473426Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::vhp:0, line 1010 <- wrt source file 2025-10-10T01:45:00.1476035Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::vhp:0 2025-10-10T01:45:00.1478248Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::hvp:0, line 1109 <- wrt source file 2025-10-10T01:45:00.1480612Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::hvp:0 2025-10-10T01:45:00.1482874Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::profile:0, line 182 <- wrt source file 2025-10-10T01:45:00.1485084Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::profile:0 2025-10-10T01:45:00.1487336Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::record_function:0, line 747 <- wrt source file 2025-10-10T01:45:00.1489697Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::record_function:0 2025-10-10T01:45:00.1492521Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::emit_itt:0, line 884 <- wrt source file 2025-10-10T01:45:00.1494686Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::emit_itt:0 2025-10-10T01:45:00.1496681Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::emit_nvtx:0, line 957 <- wrt source file 2025-10-10T01:45:00.1498737Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::emit_nvtx:0 2025-10-10T01:45:00.1501084Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::no_grad:0, line 50 <- wrt source file 2025-10-10T01:45:00.1503100Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::no_grad:0 2025-10-10T01:45:00.1505112Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::enable_grad:0, line 108 <- wrt source file 2025-10-10T01:45:00.1507188Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::enable_grad:0 2025-10-10T01:45:00.1509249Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::set_grad_enabled:0, line 166 <- wrt source file 2025-10-10T01:45:00.1511408Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::set_grad_enabled:0 2025-10-10T01:45:00.1513530Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::inference_mode:0, line 252 <- wrt source file 2025-10-10T01:45:00.1516203Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::inference_mode:0 2025-10-10T01:45:00.1518263Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.name:0, line 59 <- wrt source file 2025-10-10T01:45:00.1520291Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.name:0 2025-10-10T01:45:00.1522179Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.register_hook:0, line 116 <- wrt source file 2025-10-10T01:45:00.1523954Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.register_hook:0 2025-10-10T01:45:00.1525741Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.register_prehook:0, line 153 <- wrt source file 2025-10-10T01:45:00.1535862Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.register_prehook:0 2025-10-10T01:45:00.1537725Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::saved_tensors_hooks:0, line 290 <- wrt source file 2025-10-10T01:45:00.1539581Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::saved_tensors_hooks:0 2025-10-10T01:45:00.1541328Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::save_on_cpu:0, line 360 <- wrt source file 2025-10-10T01:45:00.1543073Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::save_on_cpu:0 2025-10-10T01:45:00.1544888Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::disable_saved_tensors_hooks:0, line 417 <- wrt source file 2025-10-10T01:45:00.1546824Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::disable_saved_tensors_hooks:0 2025-10-10T01:45:00.1549045Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::register_multi_grad_hook:0, line 494 <- wrt source file 2025-10-10T01:45:00.1553330Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::register_multi_grad_hook:0 2025-10-10T01:45:00.1555411Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::allow_mutation_on_saved_tensors:0, line 761 <- wrt source file 2025-10-10T01:45:00.1570641Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::allow_mutation_on_saved_tensors:0 2025-10-10T01:45:00.1573061Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.save_for_backward:0, line 72 <- wrt source file 2025-10-10T01:45:00.1575165Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.save_for_backward:0 2025-10-10T01:45:00.1577158Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.save_for_forward:0, line 116 <- wrt source file 2025-10-10T01:45:00.1579177Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.save_for_forward:0 2025-10-10T01:45:00.1581111Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.mark_dirty:0, line 168 <- wrt source file 2025-10-10T01:45:00.1583060Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.mark_dirty:0 2025-10-10T01:45:00.1585056Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.mark_non_differentiable:0, line 215 <- wrt source file 2025-10-10T01:45:00.1587208Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.mark_non_differentiable:0 2025-10-10T01:45:00.1589270Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.set_materialize_grads:0, line 244 <- wrt source file 2025-10-10T01:45:00.1591335Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.set_materialize_grads:0 2025-10-10T01:45:00.1593182Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::Function:0, line 486 <- wrt source file 2025-10-10T01:45:00.1595043Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::Function:0 2025-10-10T01:45:00.1596726Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::make_dual:0, line 82 <- wrt source file 2025-10-10T01:45:00.1598451Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::make_dual:0 2025-10-10T01:45:00.1600177Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::unpack_dual:0, line 151 <- wrt source file 2025-10-10T01:45:00.1601941Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::unpack_dual:0 2025-10-10T01:45:00.1603639Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::dual_level:0, line 187 <- wrt source file 2025-10-10T01:45:00.1605385Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::dual_level:0 2025-10-10T01:45:00.1607136Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/anomaly_mode.py::detect_anomaly:0, line 28 <- wrt source file 2025-10-10T01:45:00.1608974Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/anomaly_mode.py::detect_anomaly:0 2025-10-10T01:45:00.1611310Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/wrappers.py::mark_subclass_constructor_exportable_experimental:0, line 194 <- wrt source file 2025-10-10T01:45:00.1613585Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/wrappers.py::mark_subclass_constructor_exportable_experimental:0 2025-10-10T01:45:00.1615620Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/wrappers.py::allow_in_pre_dispatch_graph:0, line 262 <- wrt source file 2025-10-10T01:45:00.1617876Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/wrappers.py::allow_in_pre_dispatch_graph:0 2025-10-10T01:45:00.1619822Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/utils.py::register_module_as_pytree_input_node:0, line 1420 <- wrt source file 2025-10-10T01:45:00.1621910Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/utils.py::register_module_as_pytree_input_node:0 2025-10-10T01:45:00.1623783Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:0, line 114 <- wrt source file 2025-10-10T01:45:00.1625551Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:0 2025-10-10T01:45:00.1627236Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:1, line 125 <- wrt source file 2025-10-10T01:45:00.1628944Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:1 2025-10-10T01:45:00.1630615Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:2, line 140 <- wrt source file 2025-10-10T01:45:00.1632324Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:2 2025-10-10T01:45:00.1634380Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_multi_output_jit_fn:0, line 173 <- wrt source file 2025-10-10T01:45:00.1636342Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_multi_output_jit_fn:0 2025-10-10T01:45:00.1638109Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::gds_register_buffer:0, line 43 <- wrt source file 2025-10-10T01:45:00.1639813Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::gds_register_buffer:0 2025-10-10T01:45:00.1641500Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::gds_deregister_buffer:0, line 59 <- wrt source file 2025-10-10T01:45:00.1643227Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::gds_deregister_buffer:0 2025-10-10T01:45:00.1644812Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::GdsFile:0, line 86 <- wrt source file 2025-10-10T01:45:00.1646366Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::GdsFile:0 2025-10-10T01:45:00.1647934Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/profiler.py::profile:0, line 75 <- wrt source file 2025-10-10T01:45:00.1649579Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/profiler.py::profile:0 2025-10-10T01:45:00.1651346Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/_check.py::AttributeTypeIsSupportedChecker:0, line 36 <- wrt source file 2025-10-10T01:45:00.1653664Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/_check.py::AttributeTypeIsSupportedChecker:0 2025-10-10T01:45:00.1655589Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_load_for_lite_interpreter:0, line 22 <- wrt source file 2025-10-10T01:45:00.1657549Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_load_for_lite_interpreter:0 2025-10-10T01:45:00.1659505Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_get_mobile_model_contained_types:0, line 127 <- wrt source file 2025-10-10T01:45:00.1661832Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_get_mobile_model_contained_types:0 2025-10-10T01:45:00.1663748Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_get_model_ops_and_info:0, line 231 <- wrt source file 2025-10-10T01:45:00.1665652Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_get_model_ops_and_info:0 2025-10-10T01:45:00.1667432Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_prims/context.py::TorchRefsMode:0, line 95 <- wrt source file 2025-10-10T01:45:00.1669152Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_prims/context.py::TorchRefsMode:0 
2025-10-10T01:45:00.1671160Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/partitioner.py::HopPartitionedGraph._reorder_fw_output:0, line 133 <- wrt source file 2025-10-10T01:45:00.1673460Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/partitioner.py::HopPartitionedGraph._reorder_fw_output:0 2025-10-10T01:45:00.1675490Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/cond.py::cond:0, line 139 <- wrt source file 2025-10-10T01:45:00.1677195Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/cond.py::cond:0 2025-10-10T01:45:00.1679021Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/associative_scan.py::associative_scan:0, line 183 <- wrt source file 2025-10-10T01:45:00.1681051Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/associative_scan.py::associative_scan:0 2025-10-10T01:45:00.1683132Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/associative_scan.py::generic_associative_scan:0, line 319 <- wrt source file 2025-10-10T01:45:00.1685263Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/associative_scan.py::generic_associative_scan:0 2025-10-10T01:45:00.1687112Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/map.py::map:0, line 80 <- wrt source file 2025-10-10T01:45:00.1688768Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/map.py::map:0 2025-10-10T01:45:00.1690432Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/scan.py::scan:0, line 130 <- wrt source file 2025-10-10T01:45:00.1692139Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/scan.py::scan:0 2025-10-10T01:45:00.1693957Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/flat_apply.py::FlatApply.__call__:0, line 80 <- wrt source file 2025-10-10T01:45:00.1695899Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/flat_apply.py::FlatApply.__call__:0 2025-10-10T01:45:00.1697992Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/amp/grad_scaler.py::GradScaler:0, line 64 <- wrt source file 2025-10-10T01:45:00.1699732Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/amp/grad_scaler.py::GradScaler:0 2025-10-10T01:45:00.1701532Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_impl.py::FakeImplCtx.new_dynamic_size:0, line 175 <- wrt source file 2025-10-10T01:45:00.1978832Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_impl.py::FakeImplCtx.new_dynamic_size:0 2025-10-10T01:45:00.1981102Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/triton.py::triton_op:0, line 136 <- wrt source file 2025-10-10T01:45:00.1982923Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/triton.py::triton_op:0 2025-10-10T01:45:00.1984942Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/triton.py::wrap_triton:0, line 307 <- wrt source file 2025-10-10T01:45:00.1987001Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/triton.py::wrap_triton:0 2025-10-10T01:45:00.1988977Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::custom_op:0, line 100 <- wrt source file 2025-10-10T01:45:00.2263452Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::custom_op:0 2025-10-10T01:45:00.2265757Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.set_kernel_enabled:0, line 240 <- wrt 
source file 2025-10-10T01:45:00.2340461Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.set_kernel_enabled:0 2025-10-10T01:45:00.2342888Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_kernel:0, line 309 <- wrt source file 2025-10-10T01:45:00.2345279Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_kernel:0 2025-10-10T01:45:00.2347618Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_autograd:0, line 545 <- wrt source file 2025-10-10T01:45:00.2491440Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_autograd:0 2025-10-10T01:45:00.2493817Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_vmap:0, line 720 <- wrt source file 2025-10-10T01:45:00.2641259Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_vmap:0 2025-10-10T01:45:00.2643672Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_autocast:0, line 806 <- wrt source file 2025-10-10T01:45:00.2646142Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_autocast:0 2025-10-10T01:45:00.2648536Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_class_registry.py::register_fake_class:0, line 240 <- wrt source file 2025-10-10T01:45:00.2650945Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_class_registry.py::register_fake_class:0 2025-10-10T01:45:00.2653161Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/opaque_object.py::make_opaque:0, 
line 43 <- wrt source file 2025-10-10T01:45:00.2659221Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/opaque_object.py::make_opaque:0 2025-10-10T01:45:00.2661448Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/infer_schema.py::infer_schema:0, line 53 <- wrt source file 2025-10-10T01:45:00.2666302Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/infer_schema.py::infer_schema:0 2025-10-10T01:45:00.2668511Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_unlift.py::_convert_guards_code_to_fn:0, line 158 <- wrt source file 2025-10-10T01:45:00.2671267Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_unlift.py::_convert_guards_code_to_fn:0 2025-10-10T01:45:00.2673389Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::Dim:0, line 121 <- wrt source file 2025-10-10T01:45:00.2675610Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::Dim:0 2025-10-10T01:45:00.2677725Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::ShapesCollection:0, line 733 <- wrt source file 2025-10-10T01:45:00.2680007Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::ShapesCollection:0 2025-10-10T01:45:00.2682256Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::ShapesCollection:1, line 749 <- wrt source file 2025-10-10T01:45:00.2684610Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::ShapesCollection:1 2025-10-10T01:45:00.2686883Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::AdditionalInputs:0, line 833 <- wrt source file 2025-10-10T01:45:00.2689173Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::AdditionalInputs:0 2025-10-10T01:45:00.2691352Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/subgraph_rewriter.py::replace_pattern:0, line 125 <- wrt source file 2025-10-10T01:45:00.2693549Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/subgraph_rewriter.py::replace_pattern:0 2025-10-10T01:45:00.2695556Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::_snake_case:0, line 102 <- wrt source file 2025-10-10T01:45:00.2697476Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::_snake_case:0 2025-10-10T01:45:00.2699499Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::Graph.eliminate_dead_code:0, line 1916 <- wrt source file 2025-10-10T01:45:00.2701680Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::Graph.eliminate_dead_code:0 2025-10-10T01:45:00.2703748Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::Graph.on_generate_code:0, line 2010 <- wrt source file 2025-10-10T01:45:00.2705844Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::Graph.on_generate_code:0 2025-10-10T01:45:00.2707864Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::TensorType:0, line 12 <- wrt source file 2025-10-10T01:45:00.2709876Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::TensorType:0 2025-10-10T01:45:00.2711822Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::is_consistent:0, line 65 <- wrt source file 2025-10-10T01:45:00.2713844Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::is_consistent:0 2025-10-10T01:45:00.2716466Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::is_more_precise:0, line 93 <- wrt source file 2025-10-10T01:45:00.2718560Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::is_more_precise:0 2025-10-10T01:45:00.2720565Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/interpreter.py::Interpreter:0, line 49 <- wrt source file 2025-10-10T01:45:00.2722927Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/interpreter.py::Interpreter:0 2025-10-10T01:45:00.2724905Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/interpreter.py::Transformer:0, line 480 <- wrt source file 2025-10-10T01:45:00.2726944Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/interpreter.py::Transformer:0 2025-10-10T01:45:00.2729202Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/rewriter.py::AST_Rewriter.visit_AnnAssign:0, line 96 <- wrt source file 2025-10-10T01:45:00.2731772Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/rewriter.py::AST_Rewriter.visit_AnnAssign:0 2025-10-10T01:45:00.2734170Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::unifiable:0, line 19 <- wrt source file 2025-10-10T01:45:00.2736575Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::unifiable:0 2025-10-10T01:45:00.2738883Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::reify_object:0, line 45 <- wrt source file 2025-10-10T01:45:00.2741301Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::reify_object:0 2025-10-10T01:45:00.2743672Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::unify_object:0, line 101 <- wrt source file 2025-10-10T01:45:00.2746105Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::unify_object:0 2025-10-10T01:45:00.2748506Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::merge:0, line 37 <- wrt source file 2025-10-10T01:45:00.2751048Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::merge:0 2025-10-10T01:45:00.2753547Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::merge_with:0, line 64 <- wrt source file 2025-10-10T01:45:00.2756503Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::merge_with:0 2025-10-10T01:45:00.2759026Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::valmap:0, line 90 <- wrt source file 2025-10-10T01:45:00.2761567Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::valmap:0 2025-10-10T01:45:00.2764057Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::keymap:0, line 106 <- wrt source file 2025-10-10T01:45:00.2766597Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::keymap:0 2025-10-10T01:45:00.2769521Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::itemmap:0, line 122 <- wrt source file 2025-10-10T01:45:00.2772119Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::itemmap:0 2025-10-10T01:45:00.2774667Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::valfilter:0, line 138 <- wrt source file 2025-10-10T01:45:00.2777277Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::valfilter:0 2025-10-10T01:45:00.2782950Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::keyfilter:0, line 158 <- wrt source file 2025-10-10T01:45:00.2785583Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::keyfilter:0 2025-10-10T01:45:00.2788194Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::itemfilter:0, line 178 <- wrt source file 2025-10-10T01:45:00.2790828Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::itemfilter:0 2025-10-10T01:45:00.2793343Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::assoc:0, line 204 <- wrt source file 2025-10-10T01:45:00.2796066Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::assoc:0 2025-10-10T01:45:00.2798904Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::dissoc:0, line 221 <- wrt source file 2025-10-10T01:45:00.2801455Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::dissoc:0 2025-10-10T01:45:00.2803943Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::assoc_in:0, line 247 <- wrt source file 2025-10-10T01:45:00.2806547Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::assoc_in:0 2025-10-10T01:45:00.2809072Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::update_in:0, line 275 <- wrt source file 2025-10-10T01:45:00.2811661Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::update_in:0 2025-10-10T01:45:00.2814267Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::get_in:0, line 329 <- wrt source file 2025-10-10T01:45:00.2816961Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::get_in:0 2025-10-10T01:45:00.2819446Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::groupby:0, line 376 <- wrt source file 2025-10-10T01:45:00.2822029Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::groupby:0 2025-10-10T01:45:00.2824535Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::first:0, line 417 <- wrt source file 2025-10-10T01:45:00.2827080Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::first:0 2025-10-10T01:45:00.2829848Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/variable.py::variables:0, line 67 <- wrt source file 2025-10-10T01:45:00.2832341Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/variable.py::variables:0 2025-10-10T01:45:00.2834855Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::transitive_get:0, line 15 <- wrt source file 2025-10-10T01:45:00.2837696Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::transitive_get:0 2025-10-10T01:45:00.2840141Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::_toposort:0, line 42 <- wrt source file 2025-10-10T01:45:00.2842803Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::_toposort:0 2025-10-10T01:45:00.2845156Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::reverse_dict:0, line 70 <- wrt source file 2025-10-10T01:45:00.2847602Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::reverse_dict:0 2025-10-10T01:45:00.2849915Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::freeze:0, line 95 <- wrt source file 2025-10-10T01:45:00.2852257Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::freeze:0 2025-10-10T01:45:00.2854640Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/core.py::reify:0, line 58 <- wrt source file 2025-10-10T01:45:00.2857079Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/core.py::reify:0 2025-10-10T01:45:00.2859407Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/match.py::VarDispatcher:0, line 48 <- wrt source file 2025-10-10T01:45:00.2861868Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/match.py::VarDispatcher:0 2025-10-10T01:45:00.2864459Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::expand_tuples:0, line 18 <- wrt source file 2025-10-10T01:45:00.2867274Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::expand_tuples:0 2025-10-10T01:45:00.2869964Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::_toposort:0, line 41 <- wrt source file 2025-10-10T01:45:00.2872691Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::_toposort:0 2025-10-10T01:45:00.2875757Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::reverse_dict:0, line 68 <- wrt source file 2025-10-10T01:45:00.2878924Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::reverse_dict:0 2025-10-10T01:45:00.2881629Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::groupby:0, line 87 <- wrt source file 2025-10-10T01:45:00.2884302Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::groupby:0 2025-10-10T01:45:00.2886811Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::typename:0, line 117 <- wrt source file 2025-10-10T01:45:00.2889107Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::typename:0 2025-10-10T01:45:00.2891300Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/core.py::dispatch:0, line 27 <- wrt source file 2025-10-10T01:45:00.2893847Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/core.py::dispatch:0 2025-10-10T01:45:00.2896393Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher:0, line 113 <- wrt source file 2025-10-10T01:45:00.2898795Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher:0 2025-10-10T01:45:00.2901237Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.register:0, line 138 <- wrt source file 2025-10-10T01:45:00.2903775Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.register:0 2025-10-10T01:45:00.2906241Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.add:0, line 191 <- wrt source file 2025-10-10T01:45:00.2908709Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.add:0 2025-10-10T01:45:00.2911170Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.dispatch:0, line 305 <- wrt source file 2025-10-10T01:45:00.2913735Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.dispatch:0 2025-10-10T01:45:00.2916272Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::str_signature:0, line 436 <- wrt source file 2025-10-10T01:45:00.2919050Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::str_signature:0 2025-10-10T01:45:00.2921372Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/variadic.py::isvariadic:0, line 47 <- wrt source file 2025-10-10T01:45:00.2923740Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/variadic.py::isvariadic:0 2025-10-10T01:45:00.2925997Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/variadic.py::Variadic:0, line 83 <- wrt source file 2025-10-10T01:45:00.2928339Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/variadic.py::Variadic:0 2025-10-10T01:45:00.2930314Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/shape_prop.py::ShapeProp:0, line 99 <- wrt source file 2025-10-10T01:45:00.2932064Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/shape_prop.py::ShapeProp:0 2025-10-10T01:45:00.2934291Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/graph_drawer.py::FxGraphDrawer.get_dot_graph:0, line 129 <- wrt source file 2025-10-10T01:45:00.2936596Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/graph_drawer.py::FxGraphDrawer.get_dot_graph:0 2025-10-10T01:45:00.2938478Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/split_module.py::split_module:0, line 93 <- wrt source file 2025-10-10T01:45:00.2940295Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/split_module.py::split_module:0 2025-10-10T01:45:00.2942786Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/utils/matcher_with_name_node_map_utils.py::SubgraphMatcherWithNameNodeMap:0, line 51 <- wrt source file 2025-10-10T01:45:00.2945279Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/utils/matcher_with_name_node_map_utils.py::SubgraphMatcherWithNameNodeMap:0 2025-10-10T01:45:00.2947330Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::AveragedModel:0, line 153 <- wrt source file 2025-10-10T01:45:00.2949047Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::AveragedModel:0 2025-10-10T01:45:00.2950730Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::AveragedModel:1, line 179 <- wrt source file 2025-10-10T01:45:00.2952450Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::AveragedModel:1 2025-10-10T01:45:00.2954177Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::update_bn:0, line 343 <- wrt source file 2025-10-10T01:45:00.2955855Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::update_bn:0 2025-10-10T01:45:00.2957654Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::SWALR:0, line 402 <- wrt source file 2025-10-10T01:45:00.2959405Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::SWALR:0 2025-10-10T01:45:00.2961024Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::LambdaLR:0, line 364 <- wrt source file 2025-10-10T01:45:00.2962721Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::LambdaLR:0 
2025-10-10T01:45:00.2964454Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::MultiplicativeLR:0, line 490 <- wrt source file 2025-10-10T01:45:00.2966273Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::MultiplicativeLR:0 2025-10-10T01:45:00.2967976Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::StepLR:0, line 613 <- wrt source file 2025-10-10T01:45:00.2969633Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::StepLR:0 2025-10-10T01:45:00.2971306Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::MultiStepLR:0, line 700 <- wrt source file 2025-10-10T01:45:00.2973047Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::MultiStepLR:0 2025-10-10T01:45:00.2974868Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ConstantLR:0, line 796 <- wrt source file 2025-10-10T01:45:00.2976715Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ConstantLR:0 2025-10-10T01:45:00.2978663Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::LinearLR:0, line 903 <- wrt source file 2025-10-10T01:45:00.2980376Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::LinearLR:0 2025-10-10T01:45:00.2982100Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ExponentialLR:0, line 1025 <- wrt source file 2025-10-10T01:45:00.2983896Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ExponentialLR:0 2025-10-10T01:45:00.2985898Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::SequentialLR:0, line 1102 <- wrt source file 
2025-10-10T01:45:00.2987662Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::SequentialLR:0 2025-10-10T01:45:00.2989395Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::PolynomialLR:0, line 1254 <- wrt source file 2025-10-10T01:45:00.2991145Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::PolynomialLR:0 2025-10-10T01:45:00.2992914Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingLR:0, line 1383 <- wrt source file 2025-10-10T01:45:00.2995062Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingLR:0 2025-10-10T01:45:00.2996884Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ChainedScheduler:0, line 1491 <- wrt source file 2025-10-10T01:45:00.2998731Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ChainedScheduler:0 2025-10-10T01:45:00.3000459Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CyclicLR:0, line 1864 <- wrt source file 2025-10-10T01:45:00.3002157Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CyclicLR:0 2025-10-10T01:45:00.3003989Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts:0, line 2125 <- wrt source file 2025-10-10T01:45:00.3006006Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts:0 2025-10-10T01:45:00.3008030Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts.step:0, line 2207 <- wrt source file 2025-10-10T01:45:00.3010182Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts.step:0 2025-10-10T01:45:00.3012274Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts.step:1, line 2223 <- wrt source file 2025-10-10T01:45:00.3014369Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts.step:1 2025-10-10T01:45:00.3016246Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::OneCycleLR:0, line 2361 <- wrt source file 2025-10-10T01:45:00.3017998Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::OneCycleLR:0 2025-10-10T01:45:00.3019813Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/optimizer.py::Optimizer.load_state_dict:0, line 899 <- wrt source file 2025-10-10T01:45:00.3021735Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/optimizer.py::Optimizer.load_state_dict:0 2025-10-10T01:45:00.3023800Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/masked/_ops.py::logaddexp:0, line 1539 <- wrt source file 2025-10-10T01:45:00.3025574Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/masked/_ops.py::logaddexp:0 2025-10-10T01:45:00.3027349Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/masked/maskedtensor/core.py::is_masked_tensor:0, line 25 <- wrt source file 2025-10-10T01:45:00.3029348Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/masked/maskedtensor/core.py::is_masked_tensor:0 2025-10-10T01:45:00.3031421Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_creation.py::make_tensor:0, line 114 <- wrt source file 2025-10-10T01:45:00.3033145Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_creation.py::make_tensor:0 
2025-10-10T01:45:00.3034985Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_comparison.py::assert_close:0, line 1475 <- wrt source file 2025-10-10T01:45:00.3036826Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_comparison.py::assert_close:0 2025-10-10T01:45:00.3038651Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::parametrize:0, line 644 <- wrt source file 2025-10-10T01:45:00.3040596Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::parametrize:0 2025-10-10T01:45:00.3042517Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::reparametrize:0, line 765 <- wrt source file 2025-10-10T01:45:00.3044475Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::reparametrize:0 2025-10-10T01:45:00.3046371Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::decorateIf:0, line 854 <- wrt source file 2025-10-10T01:45:00.3048287Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::decorateIf:0 2025-10-10T01:45:00.3050282Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_symmetric_psd_matrix:0, line 4788 <- wrt source file 2025-10-10T01:45:00.3052443Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_symmetric_psd_matrix:0 2025-10-10T01:45:00.3054519Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_hermitian_psd_matrix:0, line 4802 <- wrt source file 2025-10-10T01:45:00.3056657Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_hermitian_psd_matrix:0 2025-10-10T01:45:00.3058733Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_hermitian_pd_matrix:0, line 4832 <- wrt source file 2025-10-10T01:45:00.3060844Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_hermitian_pd_matrix:0 2025-10-10T01:45:00.3062827Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/logging_utils.py::logs_to_string:0, line 194 <- wrt source file 2025-10-10T01:45:00.3064780Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/logging_utils.py::logs_to_string:0 2025-10-10T01:45:00.3067104Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/logging_utils.py::multiple_logs_to_string:0, line 220 <- wrt source file 2025-10-10T01:45:00.3069227Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/logging_utils.py::multiple_logs_to_string:0 2025-10-10T01:45:00.3071442Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/autograd_registration.py::autograd_registration_check:0, line 29 <- wrt source file 2025-10-10T01:45:00.3073848Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/autograd_registration.py::autograd_registration_check:0 2025-10-10T01:45:00.3076858Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/distributed/_tensor/common_dtensor.py::skip_unless_torch_gpu:0, line 327 <- wrt source file 2025-10-10T01:45:00.3079307Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/distributed/_tensor/common_dtensor.py::skip_unless_torch_gpu:0 2025-10-10T01:45:00.3081372Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::register_pytree_node:0, line 155 <- wrt source file 2025-10-10T01:45:00.3083232Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::register_pytree_node:0 2025-10-10T01:45:00.3084990Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_is_leaf:0, line 276 <- wrt source file 2025-10-10T01:45:00.3086720Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_is_leaf:0 2025-10-10T01:45:00.3088403Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_flatten:0, line 319 <- wrt source file 2025-10-10T01:45:00.3090131Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_flatten:0 2025-10-10T01:45:00.3091834Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_unflatten:0, line 356 <- wrt source file 2025-10-10T01:45:00.3093614Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_unflatten:0 2025-10-10T01:45:00.3095288Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_iter:0, line 386 <- wrt source file 2025-10-10T01:45:00.3096975Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_iter:0 2025-10-10T01:45:00.3098633Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_leaves:0, line 421 <- wrt source file 2025-10-10T01:45:00.3100334Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_leaves:0 2025-10-10T01:45:00.3102024Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_structure:0, line 456 <- wrt source file 2025-10-10T01:45:00.3103774Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_structure:0 2025-10-10T01:45:00.3105437Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_map:0, line 493 <- wrt source file 2025-10-10T01:45:00.3107108Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_map:0 2025-10-10T01:45:00.3108794Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::broadcast_prefix:0, line 888 <- wrt source file 2025-10-10T01:45:00.3110595Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::broadcast_prefix:0 2025-10-10T01:45:00.3112639Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::register_dataclass:0, line 301 <- wrt source file 2025-10-10T01:45:00.3114505Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::register_dataclass:0 2025-10-10T01:45:00.3116230Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::register_constant:0, line 417 <- wrt source file 2025-10-10T01:45:00.3118347Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::register_constant:0 2025-10-10T01:45:00.3120016Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::tree_is_leaf:0, line 1032 <- wrt source file 2025-10-10T01:45:00.3121683Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::tree_is_leaf:0 2025-10-10T01:45:00.3123296Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::tree_map:0, line 1351 <- wrt source file 2025-10-10T01:45:00.3124944Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::tree_map:0 2025-10-10T01:45:00.3126631Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CppExtension:0, line 1190 <- wrt source file 2025-10-10T01:45:00.3128417Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CppExtension:0 2025-10-10T01:45:00.3130198Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CUDAExtension:0, line 1262 <- wrt source file 2025-10-10T01:45:00.3132004Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CUDAExtension:0 2025-10-10T01:45:00.3133782Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CUDAExtension:1, line 1340 <- wrt source file 2025-10-10T01:45:00.3135596Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CUDAExtension:1 2025-10-10T01:45:00.3137368Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::SyclExtension:0, line 1452 <- wrt source file 2025-10-10T01:45:00.3139155Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::SyclExtension:0 2025-10-10T01:45:00.3140851Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::load:0, line 1701 <- wrt source file 2025-10-10T01:45:00.3142521Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::load:0 2025-10-10T01:45:00.3144215Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::load_inline:0, line 1977 <- wrt source file 2025-10-10T01:45:00.3145977Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::load_inline:0 2025-10-10T01:45:00.3147647Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/dlpack.py::from_dlpack:0, line 93 <- wrt source file 2025-10-10T01:45:00.3149325Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/dlpack.py::from_dlpack:0 2025-10-10T01:45:00.3151174Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::rename_privateuse1_backend:0, line 72 <- wrt source file 2025-10-10T01:45:00.3153261Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::rename_privateuse1_backend:0 2025-10-10T01:45:00.3155803Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::generate_methods_for_privateuse1_backend:0, line 379 <- wrt source file 2025-10-10T01:45:00.3158088Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::generate_methods_for_privateuse1_backend:0 2025-10-10T01:45:00.3160157Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::_get_custom_mod_func:0, line 414 <- wrt source file 2025-10-10T01:45:00.3162436Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::_get_custom_mod_func:0 2025-10-10T01:45:00.3164347Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::checkpoint_sequential:0, line 556 <- wrt source file 2025-10-10T01:45:00.3166238Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::checkpoint_sequential:0 2025-10-10T01:45:00.3168093Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::set_checkpoint_early_stop:0, line 758 <- wrt source file 2025-10-10T01:45:00.3170011Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::set_checkpoint_early_stop:0 2025-10-10T01:45:00.3171941Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::SelectiveCheckpointContext:0, line 1237 <- wrt source file 
2025-10-10T01:45:00.3173947Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::SelectiveCheckpointContext:0 2025-10-10T01:45:00.3175958Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::create_selective_checkpoint_contexts:0, line 1393 <- wrt source file 2025-10-10T01:45:00.3178067Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::create_selective_checkpoint_contexts:0 2025-10-10T01:45:00.3180098Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/throughput_benchmark.py::ThroughputBenchmark:0, line 77 <- wrt source file 2025-10-10T01:45:00.3182125Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/throughput_benchmark.py::ThroughputBenchmark:0 2025-10-10T01:45:00.3184129Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_sympy/functions.py::MinMaxBase._collapse_arguments:0, line 744 <- wrt source file 2025-10-10T01:45:00.3579694Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_sympy/functions.py::MinMaxBase._collapse_arguments:0 2025-10-10T01:45:00.3581698Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.__init__:0, line 216 <- wrt source file 2025-10-10T01:45:00.3583689Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.__init__:0 2025-10-10T01:45:00.3585590Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_hparams:0, line 320 <- wrt source file 2025-10-10T01:45:00.3587693Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_hparams:0 2025-10-10T01:45:00.3589733Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_scalar:0, line 368 <- wrt source file 2025-10-10T01:45:00.3591785Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_scalar:0 2025-10-10T01:45:00.3594288Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_scalars:0, line 400 <- wrt source file 2025-10-10T01:45:00.3596397Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_scalars:0 2025-10-10T01:45:00.3598421Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_tensor:0, line 447 <- wrt source file 2025-10-10T01:45:00.3600783Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_tensor:0 2025-10-10T01:45:00.3602831Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_histogram:0, line 486 <- wrt source file 2025-10-10T01:45:00.3604960Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_histogram:0 2025-10-10T01:45:00.3607054Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_histogram_raw:0, line 539 <- wrt source file 2025-10-10T01:45:00.3609210Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_histogram_raw:0 2025-10-10T01:45:00.3611274Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_image:0, line 605 <- wrt source file 2025-10-10T01:45:00.3613316Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_image:0 2025-10-10T01:45:00.3615339Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_images:0, line 654 <- wrt source file 2025-10-10T01:45:00.3617394Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_images:0 2025-10-10T01:45:00.3619391Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_text:0, line 817 <- wrt source file 2025-10-10T01:45:00.3621423Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_text:0 2025-10-10T01:45:00.3623454Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_embedding:0, line 884 <- wrt source file 2025-10-10T01:45:00.3625559Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_embedding:0 2025-10-10T01:45:00.3627611Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_pr_curve:0, line 996 <- wrt source file 2025-10-10T01:45:00.3629700Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_pr_curve:0 2025-10-10T01:45:00.3631917Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars_multilinechart:0, line 1070 <- wrt source file 2025-10-10T01:45:00.3634390Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars_multilinechart:0 2025-10-10T01:45:00.3636724Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars_marginchart:0, line 1091 <- wrt source file 2025-10-10T01:45:00.3639360Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars_marginchart:0 2025-10-10T01:45:00.3641584Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars:0, line 1115 <- wrt source file 2025-10-10T01:45:00.3643773Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars:0 2025-10-10T01:45:00.3645837Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_mesh:0, line 1161 <- wrt source file 2025-10-10T01:45:00.3648159Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_mesh:0 2025-10-10T01:45:00.3650123Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py::find_closure_group:0, line 439 <- wrt source file 2025-10-10T01:45:00.7856547Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py::find_closure_group:0 2025-10-10T01:45:00.7858742Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py::replace_extern_shared:0, line 535 <- wrt source file 2025-10-10T01:45:00.7860822Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py::replace_extern_shared:0 2025-10-10T01:45:00.7862733Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::Sampler:0, line 36 <- wrt source file 2025-10-10T01:45:00.7864510Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::Sampler:0 
2025-10-10T01:45:00.7866419Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::WeightedRandomSampler:0, line 225 <- wrt source file 2025-10-10T01:45:00.7868469Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::WeightedRandomSampler:0 2025-10-10T01:45:00.7870359Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::BatchSampler:0, line 296 <- wrt source file 2025-10-10T01:45:00.7873291Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::BatchSampler:0 2025-10-10T01:45:00.7875284Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::IterableDataset:0, line 94 <- wrt source file 2025-10-10T01:45:00.7882283Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::IterableDataset:0 2025-10-10T01:45:00.7884246Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::StackDataset:0, line 219 <- wrt source file 2025-10-10T01:45:00.7886153Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::StackDataset:0 2025-10-10T01:45:00.7887935Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::random_split:0, line 441 <- wrt source file 2025-10-10T01:45:00.7889739Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::random_split:0 2025-10-10T01:45:00.7891647Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/distributed.py::DistributedSampler:0, line 55 <- wrt source file 2025-10-10T01:45:00.7893647Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/distributed.py::DistributedSampler:0 2025-10-10T01:45:00.7895527Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::default_convert:0, line 40 <- wrt source file 2025-10-10T01:45:00.7898035Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::default_convert:0 2025-10-10T01:45:00.7899967Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::collate:0, line 138 <- wrt source file 2025-10-10T01:45:00.7901827Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::collate:0 2025-10-10T01:45:00.7903953Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::default_collate:0, line 366 <- wrt source file 2025-10-10T01:45:00.7905898Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::default_collate:0 2025-10-10T01:45:00.7907907Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/datapipe.py::IterDataPipe:0, line 97 <- wrt source file 2025-10-10T01:45:00.7909974Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/datapipe.py::IterDataPipe:0 2025-10-10T01:45:00.7911908Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/datapipe.py::MapDataPipe:0, line 269 <- wrt source file 2025-10-10T01:45:00.7913877Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/datapipe.py::MapDataPipe:0 2025-10-10T01:45:00.7916029Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/utils/common.py::validate_input_col:0, line 37 <- wrt source file 2025-10-10T01:45:00.7918151Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/utils/common.py::validate_input_col:0 2025-10-10T01:45:00.7920223Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/utils/decoder.py::basichandlers:0, line 47 <- wrt source file 2025-10-10T01:45:00.7922324Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/utils/decoder.py::basichandlers:0 2025-10-10T01:45:00.7924367Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/callable.py::MapperMapDataPipe:0, line 36 <- wrt source file 2025-10-10T01:45:00.7926505Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/callable.py::MapperMapDataPipe:0 2025-10-10T01:45:00.7928648Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combining.py::ConcaterMapDataPipe:0, line 29 <- wrt source file 2025-10-10T01:45:00.7930861Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combining.py::ConcaterMapDataPipe:0 2025-10-10T01:45:00.7932997Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combining.py::ZipperMapDataPipe:0, line 76 <- wrt source file 2025-10-10T01:45:00.7935199Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combining.py::ZipperMapDataPipe:0 2025-10-10T01:45:00.7937358Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/utils.py::SequenceWrapperMapDataPipe:0, line 29 <- wrt source file 2025-10-10T01:45:00.7939602Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/utils.py::SequenceWrapperMapDataPipe:0 2025-10-10T01:45:00.7941835Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combinatorics.py::ShufflerIterDataPipe:0, line 34 <- wrt source file 2025-10-10T01:45:00.7944423Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combinatorics.py::ShufflerIterDataPipe:0 2025-10-10T01:45:00.7946620Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/grouping.py::BatcherMapDataPipe:0, line 29 <- wrt source file 2025-10-10T01:45:00.7948767Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/grouping.py::BatcherMapDataPipe:0 2025-10-10T01:45:00.7951175Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/callable.py::MapperIterDataPipe:0, line 53 <- wrt source file 2025-10-10T01:45:00.7953330Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/callable.py::MapperIterDataPipe:0 2025-10-10T01:45:00.7955755Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/callable.py::CollatorIterDataPipe:0, line 202 <- wrt source file 2025-10-10T01:45:00.7957972Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/callable.py::CollatorIterDataPipe:0 2025-10-10T01:45:00.7960135Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ConcaterIterDataPipe:0, line 38 <- wrt source file 2025-10-10T01:45:00.7962387Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ConcaterIterDataPipe:0 2025-10-10T01:45:00.7964546Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ForkerIterDataPipe:0, line 89 <- wrt source file 2025-10-10T01:45:00.7966703Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ForkerIterDataPipe:0 2025-10-10T01:45:00.7968805Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::_ChildDataPipe:0, line 307 <- wrt source file 2025-10-10T01:45:00.7970908Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::_ChildDataPipe:0 2025-10-10T01:45:00.7973081Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::DemultiplexerIterDataPipe:0, line 394 <- wrt source file 2025-10-10T01:45:00.7975389Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::DemultiplexerIterDataPipe:0 2025-10-10T01:45:00.7977619Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::MultiplexerIterDataPipe:0, line 611 <- wrt source file 2025-10-10T01:45:00.7979876Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::MultiplexerIterDataPipe:0 2025-10-10T01:45:00.7982037Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ZipperIterDataPipe:0, line 681 <- wrt source file 2025-10-10T01:45:00.7984213Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ZipperIterDataPipe:0 2025-10-10T01:45:00.7986393Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/filelister.py::FileListerIterDataPipe:0, line 30 <- wrt source file 2025-10-10T01:45:00.7988648Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/filelister.py::FileListerIterDataPipe:0 2025-10-10T01:45:00.7991219Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/streamreader.py::StreamReaderIterDataPipe:0, line 25 <- wrt source file 2025-10-10T01:45:00.7993588Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/streamreader.py::StreamReaderIterDataPipe:0 2025-10-10T01:45:00.7995956Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/utils.py::IterableWrapperIterDataPipe:0, line 29 <- wrt source file 2025-10-10T01:45:00.7998532Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/utils.py::IterableWrapperIterDataPipe:0 2025-10-10T01:45:00.8000737Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/fileopener.py::FileOpenerIterDataPipe:0, line 34 <- wrt source file 2025-10-10T01:45:00.8003007Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/fileopener.py::FileOpenerIterDataPipe:0 2025-10-10T01:45:00.8005238Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combinatorics.py::ShufflerIterDataPipe:0, line 89 <- wrt source file 2025-10-10T01:45:00.8007520Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combinatorics.py::ShufflerIterDataPipe:0 2025-10-10T01:45:00.8009694Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/selecting.py::FilterIterDataPipe:0, line 37 <- wrt source file 2025-10-10T01:45:00.8011868Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/selecting.py::FilterIterDataPipe:0 2025-10-10T01:45:00.8014005Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::BatcherIterDataPipe:0, line 41 <- wrt source file 2025-10-10T01:45:00.8016195Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::BatcherIterDataPipe:0 2025-10-10T01:45:00.8018386Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::UnBatcherIterDataPipe:0, line 101 <- wrt source file 2025-10-10T01:45:00.8020609Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::UnBatcherIterDataPipe:0 2025-10-10T01:45:00.8022831Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::GrouperIterDataPipe:0, line 168 <- wrt source file 2025-10-10T01:45:00.8025034Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::GrouperIterDataPipe:0 2025-10-10T01:45:00.8027206Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_conv_bn:0, line 32 <- wrt source file 2025-10-10T01:45:00.8029261Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_conv_bn:0 2025-10-10T01:45:00.8031298Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_conv_bn_relu:0, line 77 <- wrt source file 2025-10-10T01:45:00.8033415Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_conv_bn_relu:0 2025-10-10T01:45:00.8035795Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_linear_bn:0, line 131 <- wrt source file 2025-10-10T01:45:00.8038176Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_linear_bn:0 2025-10-10T01:45:00.8040285Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_convtranspose_bn:0, line 164 <- wrt source file 2025-10-10T01:45:00.8042515Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_convtranspose_bn:0 2025-10-10T01:45:00.8044543Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::prepare_pt2e:0, line 51 <- wrt source file 2025-10-10T01:45:00.8046939Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::prepare_pt2e:0 2025-10-10T01:45:00.8048885Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::prepare_qat_pt2e:0, line 130 <- wrt source file 2025-10-10T01:45:00.8050905Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::prepare_qat_pt2e:0 2025-10-10T01:45:00.8052846Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::convert_pt2e:0, line 228 <- wrt source file 2025-10-10T01:45:00.8054782Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::convert_pt2e:0 2025-10-10T01:45:00.8056648Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::get_combined_dict:0, line 173 <- wrt source file 2025-10-10T01:45:00.8058556Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::get_combined_dict:0 2025-10-10T01:45:00.8060404Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_path_of_module:0, line 545 <- wrt source file 2025-10-10T01:45:00.8062350Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_path_of_module:0 2025-10-10T01:45:00.8064247Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_signature_locals:0, line 567 <- wrt source file 2025-10-10T01:45:00.8066213Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_signature_locals:0 2025-10-10T01:45:00.8068120Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_default_kwargs:0, line 581 <- wrt source file 2025-10-10T01:45:00.8070051Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_default_kwargs:0 2025-10-10T01:45:00.8071926Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_normalize_kwargs:0, line 603 <- wrt source file 2025-10-10T01:45:00.8073819Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_normalize_kwargs:0 2025-10-10T01:45:00.8075756Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_num_pos_args:0, line 730 <- wrt source file 2025-10-10T01:45:00.8077609Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_num_pos_args:0 2025-10-10T01:45:00.8079461Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuse_modules.py::fuse_modules:0, line 176 <- wrt source file 2025-10-10T01:45:00.8081390Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuse_modules.py::fuse_modules:0 2025-10-10T01:45:00.8083565Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/observer.py::_with_args:0, line 110 <- wrt source file 2025-10-10T01:45:00.8085455Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/observer.py::_with_args:0 2025-10-10T01:45:00.8087323Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/observer.py::_with_callable_args:0, line 132 <- wrt source file 2025-10-10T01:45:00.8089288Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/observer.py::_with_callable_args:0 2025-10-10T01:45:00.8091427Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::fuse_fx:0, line 218 <- wrt source file 2025-10-10T01:45:00.8093275Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::fuse_fx:0 2025-10-10T01:45:00.8095103Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::prepare_fx:0, line 288 <- wrt source file 2025-10-10T01:45:00.8096981Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::prepare_fx:0 2025-10-10T01:45:00.8098861Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::prepare_qat_fx:0, line 427 <- wrt source file 2025-10-10T01:45:00.8100802Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::prepare_qat_fx:0 2025-10-10T01:45:00.8102678Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::convert_fx:0, line 608 <- wrt source file 2025-10-10T01:45:00.8104546Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::convert_fx:0 2025-10-10T01:45:00.8106497Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::convert_to_reference_fx:0, line 668 <- wrt source file 2025-10-10T01:45:00.8108567Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::convert_to_reference_fx:0 2025-10-10T01:45:00.8110692Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::_convert_to_reference_decomposed_fx:0, line 720 <- wrt source file 2025-10-10T01:45:00.8113145Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::_convert_to_reference_decomposed_fx:0 2025-10-10T01:45:00.8115581Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report.py::ModelReport:0, line 84 <- wrt source file 2025-10-10T01:45:00.8117743Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report.py::ModelReport:0 2025-10-10T01:45:00.8120221Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_filtered_tables:0, line 341 <- wrt source file 2025-10-10T01:45:00.8123206Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_filtered_tables:0 2025-10-10T01:45:00.8126038Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_table_visualization:0, line 429 <- wrt source file 2025-10-10T01:45:00.8129213Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_table_visualization:0 2025-10-10T01:45:00.8132046Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_plot_visualization:0, line 591 <- wrt source file 2025-10-10T01:45:00.8134898Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_plot_visualization:0 2025-10-10T01:45:00.8138047Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_histogram_visualization:0, line 664 <- wrt source file 2025-10-10T01:45:00.8140975Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_histogram_visualization:0 2025-10-10T01:45:00.8143528Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/_affine_quantization.py::_get_reduction_params:0, line 102 <- wrt source file 2025-10-10T01:45:00.8145769Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/_affine_quantization.py::_get_reduction_params:0 2025-10-10T01:45:00.8147937Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/_affine_quantization.py::_register_custom_op:0, line 148 <- wrt source file 2025-10-10T01:45:00.8150145Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/_affine_quantization.py::_register_custom_op:0 2025-10-10T01:45:00.8152350Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/utils.py::_replace_literals_with_new_placeholders:0, line 439 <- wrt source file 2025-10-10T01:45:00.8155120Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/utils.py::_replace_literals_with_new_placeholders:0 2025-10-10T01:45:00.8157376Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/prepare.py::_get_edge_or_node_to_group_id:0, line 189 <- wrt source file 2025-10-10T01:45:00.8159553Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/prepare.py::_get_edge_or_node_to_group_id:0 2025-10-10T01:45:00.8161726Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/backend_config/onednn.py::_fuse_linear_bn_leaky_relu:0, line 85 <- wrt source file 2025-10-10T01:45:00.8164195Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/backend_config/onednn.py::_fuse_linear_bn_leaky_relu:0 2025-10-10T01:45:00.8166845Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/backend_config/backend_config.py::DTypeConfig:0, line 216 <- wrt source file 2025-10-10T01:45:00.8168362Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/backend_config/backend_config.py::DTypeConfig:0 2025-10-10T01:45:00.8169698Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/data_scheduler/base_data_scheduler.py::BaseDataScheduler.get_schedule_param:0, line 98 <- wrt source file 2025-10-10T01:45:00.8171185Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/data_scheduler/base_data_scheduler.py::BaseDataScheduler.get_schedule_param:0 2025-10-10T01:45:00.8172595Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/data_sparsifier/base_data_sparsifier.py::BaseDataSparsifier:0, line 55 <- wrt source file 2025-10-10T01:45:00.8174147Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/data_sparsifier/base_data_sparsifier.py::BaseDataSparsifier:0 2025-10-10T01:45:00.8175410Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/sparsifier/base_sparsifier.py::BaseSparsifier:0, line 47 <- wrt source file 2025-10-10T01:45:00.8176597Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/sparsifier/base_sparsifier.py::BaseSparsifier:0 2025-10-10T01:45:00.8177960Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/sparsifier/base_sparsifier.py::BaseSparsifier.squash_mask:0, line 246 <- wrt source file 2025-10-10T01:45:00.8179228Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/sparsifier/base_sparsifier.py::BaseSparsifier.squash_mask:0 2025-10-10T01:45:00.8180411Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/scheduler/lambda_scheduler.py::LambdaSL:0, line 25 <- wrt source file 2025-10-10T01:45:00.8181534Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/scheduler/lambda_scheduler.py::LambdaSL:0 2025-10-10T01:45:00.8182570Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv1d:0, line 211 <- wrt source file 2025-10-10T01:45:00.8183594Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv1d:0 2025-10-10T01:45:00.8184578Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv2d:0, line 283 <- wrt source file 2025-10-10T01:45:00.8185581Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv2d:0 2025-10-10T01:45:00.8186571Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv3d:0, line 359 <- wrt source file 2025-10-10T01:45:00.8187572Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv3d:0 2025-10-10T01:45:00.8188585Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/__init__.py::Quantize:0, line 95 <- wrt source file 2025-10-10T01:45:00.8189654Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/__init__.py::Quantize:0 2025-10-10T01:45:00.8197646Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/__init__.py::DeQuantize:0, line 145 <- wrt source file 2025-10-10T01:45:00.8198952Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/__init__.py::DeQuantize:0 2025-10-10T01:45:00.8200087Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/embedding_ops.py::Embedding:0, line 111 <- wrt source file 2025-10-10T01:45:00.8235109Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/embedding_ops.py::Embedding:0 2025-10-10T01:45:00.8236314Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/embedding_ops.py::EmbeddingBag:0, line 275 <- wrt source file 2025-10-10T01:45:00.8237514Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/embedding_ops.py::EmbeddingBag:0 2025-10-10T01:45:00.8238619Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/activation.py::ReLU6:0, line 36 <- wrt source file 2025-10-10T01:45:00.8239915Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/activation.py::ReLU6:0 2025-10-10T01:45:00.8241054Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/functional_modules.py::FloatFunctional:0, line 23 <- wrt source file 2025-10-10T01:45:00.8242285Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/functional_modules.py::FloatFunctional:0 2025-10-10T01:45:00.8243474Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/functional_modules.py::QFunctional:0, line 176 <- wrt source file 2025-10-10T01:45:00.8244853Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/functional_modules.py::QFunctional:0 
2025-10-10T01:45:00.8245947Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/linear.py::Linear:0, line 138 <- wrt source file 2025-10-10T01:45:00.8246993Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/linear.py::Linear:0 2025-10-10T01:45:00.8247986Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv1d:0, line 376 <- wrt source file 2025-10-10T01:45:00.8248995Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv1d:0 2025-10-10T01:45:00.8249979Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv2d:0, line 506 <- wrt source file 2025-10-10T01:45:00.8250977Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv2d:0 2025-10-10T01:45:00.8251962Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv3d:0, line 636 <- wrt source file 2025-10-10T01:45:00.8252980Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv3d:0 2025-10-10T01:45:00.8254027Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose1d:0, line 893 <- wrt source file 2025-10-10T01:45:00.8255132Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose1d:0 2025-10-10T01:45:00.8256233Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose2d:0, line 1015 <- wrt source file 2025-10-10T01:45:00.8257337Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose2d:0 2025-10-10T01:45:00.8258413Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose3d:0, line 1141 <- wrt source file 2025-10-10T01:45:00.8259551Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose3d:0 2025-10-10T01:45:00.8260568Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/rnn.py::LSTM:0, line 24 <- wrt source file 2025-10-10T01:45:00.8261557Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/rnn.py::LSTM:0 2025-10-10T01:45:00.8262598Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/linear.py::Linear:0, line 30 <- wrt source file 2025-10-10T01:45:00.8263708Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/linear.py::Linear:0 2025-10-10T01:45:00.8264916Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv1d:0, line 43 <- wrt source file 2025-10-10T01:45:00.8266013Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv1d:0 2025-10-10T01:45:00.8267095Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv2d:0, line 125 <- wrt source file 2025-10-10T01:45:00.8268172Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv2d:0 2025-10-10T01:45:00.8269368Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv3d:0, line 210 <- wrt source file 2025-10-10T01:45:00.8270441Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv3d:0 2025-10-10T01:45:00.8271543Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose1d:0, line 297 <- wrt source file 2025-10-10T01:45:00.8272717Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose1d:0 2025-10-10T01:45:00.8273867Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose2d:0, line 379 <- wrt source file 2025-10-10T01:45:00.8275097Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose2d:0 2025-10-10T01:45:00.8276237Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose3d:0, line 461 <- wrt source file 2025-10-10T01:45:00.8277408Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose3d:0 2025-10-10T01:45:00.8278497Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::LSTM:0, line 515 <- wrt source file 2025-10-10T01:45:00.8279567Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::LSTM:0 2025-10-10T01:45:00.8280611Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::GRU:0, line 802 <- wrt source file 2025-10-10T01:45:00.8281666Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::GRU:0 2025-10-10T01:45:00.8282721Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::RNNCell:0, line 1208 <- wrt source file 2025-10-10T01:45:00.8283819Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::RNNCell:0 2025-10-10T01:45:00.8284886Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::LSTMCell:0, line 1275 <- wrt source file 2025-10-10T01:45:00.8285981Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::LSTMCell:0 2025-10-10T01:45:00.8287058Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::GRUCell:0, line 1328 <- wrt source file 2025-10-10T01:45:00.8288132Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::GRUCell:0 2025-10-10T01:45:00.8289422Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearReLU:0, line 25 <- wrt source file 2025-10-10T01:45:00.8290645Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearReLU:0 2025-10-10T01:45:00.8291845Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearLeakyReLU:0, line 67 <- wrt source file 2025-10-10T01:45:00.8293084Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearLeakyReLU:0 2025-10-10T01:45:00.8294467Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearTanh:0, line 142 <- wrt source file 2025-10-10T01:45:00.8295669Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearTanh:0 2025-10-10T01:45:00.8296890Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/dynamic/modules/linear_relu.py::LinearReLU:0, line 24 <- wrt source file 2025-10-10T01:45:00.8298156Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/dynamic/modules/linear_relu.py::LinearReLU:0 2025-10-10T01:45:00.8299335Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/qat/modules/linear_relu.py::LinearReLU:0, line 30 <- wrt source file 2025-10-10T01:45:00.8300484Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/qat/modules/linear_relu.py::LinearReLU:0 2025-10-10T01:45:00.8301582Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantizable/modules/rnn.py::LSTMCell:0, line 30 <- wrt source file 2025-10-10T01:45:00.8315681Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantizable/modules/rnn.py::LSTMCell:0 2025-10-10T01:45:00.8316723Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantizable/modules/rnn.py::LSTM:0, line 414 <- wrt source file 2025-10-10T01:45:00.8352915Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantizable/modules/rnn.py::LSTM:0 2025-10-10T01:45:00.8354610Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/benchmark_utils.py::benchmark_utilization:0, line 184 <- wrt source file 2025-10-10T01:45:00.8355744Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/benchmark_utils.py::benchmark_utilization:0 2025-10-10T01:45:00.8356771Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py::aot_function:0, line 770 <- wrt source file 2025-10-10T01:45:00.8650874Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py::aot_function:0 2025-10-10T01:45:00.8651891Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/fx_minifier.py::minifier:0, line 194 <- wrt source file 2025-10-10T01:45:00.8652878Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/fx_minifier.py::minifier:0 2025-10-10T01:45:00.8653873Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/functional_call.py::functional_call:0, line 36 <- wrt source file 2025-10-10T01:45:00.8655451Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/functional_call.py::functional_call:0 2025-10-10T01:45:00.8656465Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::vjp:0, line 234 <- wrt source file 2025-10-10T01:45:00.8698787Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::vjp:0 2025-10-10T01:45:00.8700524Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jacrev:0, line 476 <- wrt source file 2025-10-10T01:45:00.8760200Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jacrev:0 2025-10-10T01:45:00.8761939Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jvp:0, line 1024 <- wrt source file 2025-10-10T01:45:01.0920250Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jvp:0 2025-10-10T01:45:01.0922443Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jacfwd:0, line 1182 <- wrt source file 2025-10-10T01:45:01.0976965Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jacfwd:0 2025-10-10T01:45:01.0978816Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::hessian:0, line 1342 <- wrt source file 2025-10-10T01:45:01.0994864Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::hessian:0 2025-10-10T01:45:01.0996068Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::functionalize:0, line 1506 <- wrt source file 2025-10-10T01:45:01.0997101Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::functionalize:0 2025-10-10T01:45:01.0997979Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::linearize:0, line 1705 <- wrt source file 2025-10-10T01:45:01.1143509Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::linearize:0 2025-10-10T01:45:01.1145709Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py::CompilerWrapper.post_compile:0, line 1110 <- wrt source file 2025-10-10T01:45:01.1147972Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py::CompilerWrapper.post_compile:0 2025-10-10T01:45:01.1150163Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py::InductorWrapper.post_compile:0, line 1165 <- wrt source file 2025-10-10T01:45:01.1152381Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py::InductorWrapper.post_compile:0 2025-10-10T01:45:01.1154728Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/profiler/profiler.py::_KinetoProfile.toggle_collection_dynamic:0, line 300 <- wrt source file 2025-10-10T01:45:01.1156907Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/profiler/profiler.py::_KinetoProfile.toggle_collection_dynamic:0 2025-10-10T01:45:01.1158787Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/profiler/profiler.py::profile:0, line 622 <- wrt source file 2025-10-10T01:45:01.1160520Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/profiler/profiler.py::profile:0 
2025-10-10T01:45:01.1162431Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/generalized_pareto.py::GeneralizedPareto:0, line 26 <- wrt source file 2025-10-10T01:45:01.1164592Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/generalized_pareto.py::GeneralizedPareto:0 2025-10-10T01:45:01.1166888Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/gamma.py::Gamma:0, line 24 <- wrt source file 2025-10-10T01:45:01.1168665Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/gamma.py::Gamma:0 2025-10-10T01:45:01.1170452Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/weibull.py::Weibull:0, line 22 <- wrt source file 2025-10-10T01:45:01.1172228Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/weibull.py::Weibull:0 2025-10-10T01:45:01.1174319Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/exponential.py::Exponential:0, line 20 <- wrt source file 2025-10-10T01:45:01.1176267Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/exponential.py::Exponential:0 2025-10-10T01:45:01.1178064Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/cauchy.py::Cauchy:0, line 23 <- wrt source file 2025-10-10T01:45:01.1179806Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/cauchy.py::Cauchy:0 2025-10-10T01:45:01.1181676Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/relaxed_bernoulli.py::RelaxedBernoulli:0, line 130 <- wrt source file 2025-10-10T01:45:01.1183738Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/relaxed_bernoulli.py::RelaxedBernoulli:0 2025-10-10T01:45:01.1185605Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/poisson.py::Poisson:0, line 25 <- wrt source file 2025-10-10T01:45:01.1187356Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/poisson.py::Poisson:0 2025-10-10T01:45:01.1189065Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/laplace.py::Laplace:0, line 20 <- wrt source file 2025-10-10T01:45:01.1190816Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/laplace.py::Laplace:0 2025-10-10T01:45:01.1192727Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/multivariate_normal.py::MultivariateNormal:0, line 103 <- wrt source file 2025-10-10T01:45:01.1194963Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/multivariate_normal.py::MultivariateNormal:0 2025-10-10T01:45:01.1197075Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/continuous_bernoulli.py::ContinuousBernoulli:0, line 35 <- wrt source file 2025-10-10T01:45:01.1199237Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/continuous_bernoulli.py::ContinuousBernoulli:0 2025-10-10T01:45:01.1201283Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/fishersnedecor.py::FisherSnedecor:0, line 21 <- wrt source file 2025-10-10T01:45:01.1203291Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/fishersnedecor.py::FisherSnedecor:0 2025-10-10T01:45:01.1205139Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/bernoulli.py::Bernoulli:0, line 30 <- wrt source file 2025-10-10T01:45:01.1206962Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/bernoulli.py::Bernoulli:0 2025-10-10T01:45:01.1208674Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/beta.py::Beta:0, line 21 <- wrt source file 2025-10-10T01:45:01.1210334Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/beta.py::Beta:0 2025-10-10T01:45:01.1212573Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/relaxed_categorical.py::RelaxedOneHotCategorical:0, line 117 <- wrt source file 2025-10-10T01:45:01.1214819Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/relaxed_categorical.py::RelaxedOneHotCategorical:0 2025-10-10T01:45:01.1216745Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/von_mises.py::VonMises:0, line 119 <- wrt source file 2025-10-10T01:45:01.1218817Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/von_mises.py::VonMises:0 2025-10-10T01:45:01.1220597Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/half_cauchy.py::HalfCauchy:0, line 24 <- wrt source file 2025-10-10T01:45:01.1223624Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/half_cauchy.py::HalfCauchy:0 2025-10-10T01:45:01.1225409Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/binomial.py::Binomial:0, line 31 <- wrt source file 2025-10-10T01:45:01.1229760Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/binomial.py::Binomial:0 2025-10-10T01:45:01.1231582Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/wishart.py::Wishart:0, line 39 <- wrt source file 2025-10-10T01:45:01.1233407Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/wishart.py::Wishart:0 2025-10-10T01:45:01.1235349Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/inverse_gamma.py::InverseGamma:0, line 24 <- wrt source file 
2025-10-10T01:45:01.1237309Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/inverse_gamma.py::InverseGamma:0 2025-10-10T01:45:01.1239131Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/dirichlet.py::Dirichlet:0, line 44 <- wrt source file 2025-10-10T01:45:01.1240939Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/dirichlet.py::Dirichlet:0 2025-10-10T01:45:01.1242840Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/mixture_same_family.py::MixtureSameFamily:0, line 24 <- wrt source file 2025-10-10T01:45:01.1244962Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/mixture_same_family.py::MixtureSameFamily:0 2025-10-10T01:45:01.1246931Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::CatTransform:0, line 1076 <- wrt source file 2025-10-10T01:45:01.1248834Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::CatTransform:0 2025-10-10T01:45:01.1250701Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::StackTransform:0, line 1190 <- wrt source file 2025-10-10T01:45:01.1252623Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::StackTransform:0 2025-10-10T01:45:01.1254678Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::CumulativeDistributionTransform:0, line 1268 <- wrt source file 2025-10-10T01:45:01.1256910Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::CumulativeDistributionTransform:0 2025-10-10T01:45:01.1258900Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/constraints.py::is_dependent:0, line 167 <- wrt source file 2025-10-10T01:45:01.1261156Z * 
SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/constraints.py::is_dependent:0 2025-10-10T01:45:01.1263088Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/constraints.py::_DependentProperty:0, line 188 <- wrt source file 2025-10-10T01:45:01.1265093Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/constraints.py::_DependentProperty:0 2025-10-10T01:45:01.1267230Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/utils.py::clamp_probs:0, line 114 <- wrt source file 2025-10-10T01:45:01.1269001Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/utils.py::clamp_probs:0 2025-10-10T01:45:01.1270757Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/geometric.py::Geometric:0, line 36 <- wrt source file 2025-10-10T01:45:01.1272578Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/geometric.py::Geometric:0 2025-10-10T01:45:01.1274409Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/uniform.py::Uniform:0, line 21 <- wrt source file 2025-10-10T01:45:01.1276163Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/uniform.py::Uniform:0 2025-10-10T01:45:01.1277879Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/studentT.py::StudentT:0, line 22 <- wrt source file 2025-10-10T01:45:01.1279667Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/studentT.py::StudentT:0 2025-10-10T01:45:01.1281552Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/one_hot_categorical.py::OneHotCategorical:0, line 34 <- wrt source file 2025-10-10T01:45:01.1285302Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/one_hot_categorical.py::OneHotCategorical:0 
2025-10-10T01:45:01.1287597Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/lowrank_multivariate_normal.py::LowRankMultivariateNormal:0, line 63 <- wrt source file 2025-10-10T01:45:01.1290081Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/lowrank_multivariate_normal.py::LowRankMultivariateNormal:0 2025-10-10T01:45:01.1292158Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/normal.py::Normal:0, line 22 <- wrt source file 2025-10-10T01:45:01.1293899Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/normal.py::Normal:0 2025-10-10T01:45:01.1295597Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/gumbel.py::Gumbel:0, line 23 <- wrt source file 2025-10-10T01:45:01.1297332Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/gumbel.py::Gumbel:0 2025-10-10T01:45:01.1299083Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/lkj_cholesky.py::LKJCholesky:0, line 43 <- wrt source file 2025-10-10T01:45:01.1355597Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/lkj_cholesky.py::LKJCholesky:0 2025-10-10T01:45:01.1357577Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/kumaraswamy.py::Kumaraswamy:0, line 30 <- wrt source file 2025-10-10T01:45:01.1359519Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/kumaraswamy.py::Kumaraswamy:0 2025-10-10T01:45:01.1361789Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/half_normal.py::HalfNormal:0, line 24 <- wrt source file 2025-10-10T01:45:01.1363691Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/half_normal.py::HalfNormal:0 2025-10-10T01:45:01.1365514Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/categorical.py::Categorical:0, line 42 <- wrt source file 2025-10-10T01:45:01.1367374Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/categorical.py::Categorical:0 2025-10-10T01:45:01.1369499Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/multinomial.py::Multinomial:0, line 38 <- wrt source file 2025-10-10T01:45:01.1371351Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/multinomial.py::Multinomial:0 2025-10-10T01:45:01.1373168Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/independent.py::Independent:0, line 27 <- wrt source file 2025-10-10T01:45:01.1375024Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/independent.py::Independent:0 2025-10-10T01:45:01.1376773Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/pareto.py::Pareto:0, line 20 <- wrt source file 2025-10-10T01:45:01.1378495Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/pareto.py::Pareto:0 2025-10-10T01:45:01.1380318Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/logistic_normal.py::LogisticNormal:0, line 28 <- wrt source file 2025-10-10T01:45:01.1382304Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/logistic_normal.py::LogisticNormal:0 2025-10-10T01:45:01.1384182Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/log_normal.py::LogNormal:0, line 23 <- wrt source file 2025-10-10T01:45:01.1385979Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/log_normal.py::LogNormal:0 2025-10-10T01:45:01.1387665Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/chi2.py::Chi2:0, line 18 <- wrt source file 
2025-10-10T01:45:01.1389313Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/chi2.py::Chi2:0 2025-10-10T01:45:01.1390944Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_logging/_internal.py::set_logs:0, line 460 <- wrt source file 2025-10-10T01:45:01.1392656Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_logging/_internal.py::set_logs:0 2025-10-10T01:45:01.1394661Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/cpp_builder.py::get_name_and_dir_from_output_file_path:0, line 1795 <- wrt source file 2025-10-10T01:45:01.1396762Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/cpp_builder.py::get_name_and_dir_from_output_file_path:0 2025-10-10T01:45:01.1398764Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/select_algorithm.py::add_preprocessing_fn:0, line 3763 <- wrt source file 2025-10-10T01:45:01.1400743Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/select_algorithm.py::add_preprocessing_fn:0 2025-10-10T01:45:01.1402630Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/codecache.py::WritableTempFile:0, line 391 <- wrt source file 2025-10-10T01:45:01.1404477Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/codecache.py::WritableTempFile:0 2025-10-10T01:45:01.1406730Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_acquire_lock_with_timeout:0, line 55 <- wrt source file 2025-10-10T01:45:01.1408875Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_acquire_lock_with_timeout:0 2025-10-10T01:45:01.1410989Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_unsafe_acquire_lock_with_timeout:0, line 93 <- wrt source file 
2025-10-10T01:45:01.1413486Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_unsafe_acquire_lock_with_timeout:0 2025-10-10T01:45:01.1415622Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_acquire_flock_with_timeout:0, line 130 <- wrt source file 2025-10-10T01:45:01.1417753Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_acquire_flock_with_timeout:0 2025-10-10T01:45:01.1419901Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_unsafe_acquire_flock_with_timeout:0, line 169 <- wrt source file 2025-10-10T01:45:01.1422134Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_unsafe_acquire_flock_with_timeout:0 2025-10-10T01:45:01.1424326Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/template_heuristics/registry.py::register_template_heuristic:0, line 54 <- wrt source file 2025-10-10T01:45:01.1426569Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/template_heuristics/registry.py::register_template_heuristic:0 2025-10-10T01:45:01.1428556Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/decorators.py::substitute_in_graph:0, line 359 <- wrt source file 2025-10-10T01:45:01.1430419Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/decorators.py::substitute_in_graph:0 2025-10-10T01:45:01.1432329Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/variables/base.py::VariableTracker.python_type:0, line 322 <- wrt source file 2025-10-10T01:45:01.1434619Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/variables/base.py::VariableTracker.python_type:0 2025-10-10T01:45:01.1436610Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/semi_structured.py::to_sparse_semi_structured:0, line 340 <- wrt source file 2025-10-10T01:45:01.1438583Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/semi_structured.py::to_sparse_semi_structured:0 2025-10-10T01:45:01.1440548Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::fractional_max_pool2d_with_indices:0, line 470 <- wrt source file 2025-10-10T01:45:01.1454847Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::fractional_max_pool2d_with_indices:0 2025-10-10T01:45:01.1456893Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::fractional_max_pool3d_with_indices:0, line 589 <- wrt source file 2025-10-10T01:45:01.2159188Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::fractional_max_pool3d_with_indices:0 2025-10-10T01:45:01.2197337Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::gumbel_softmax:0, line 2197 <- wrt source file 2025-10-10T01:45:01.2208027Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::gumbel_softmax:0 2025-10-10T01:45:01.2209776Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::embedding:0, line 2502 <- wrt source file 2025-10-10T01:45:01.2216995Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::embedding:0 2025-10-10T01:45:01.2218767Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::embedding_bag:0, line 2642 <- wrt source file 2025-10-10T01:45:01.2229170Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::embedding_bag:0 2025-10-10T01:45:01.2230877Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::ctc_loss:0, line 3080 
<- wrt source file 2025-10-10T01:45:01.2307348Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::ctc_loss:0 2025-10-10T01:45:01.2309059Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::nll_loss:0, line 3150 <- wrt source file 2025-10-10T01:45:01.2316112Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::nll_loss:0 2025-10-10T01:45:01.2317805Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::cross_entropy:0, line 3467 <- wrt source file 2025-10-10T01:45:01.2327901Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::cross_entropy:0 2025-10-10T01:45:01.2329684Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::binary_cross_entropy:0, line 3533 <- wrt source file 2025-10-10T01:45:01.2334759Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::binary_cross_entropy:0 2025-10-10T01:45:01.2336636Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::binary_cross_entropy_with_logits:0, line 3604 <- wrt source file 2025-10-10T01:45:01.2342983Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::binary_cross_entropy_with_logits:0 2025-10-10T01:45:01.2344775Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::pad:0, line 5375 <- wrt source file 2025-10-10T01:45:01.2354909Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::pad:0 2025-10-10T01:45:01.2356541Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv1d_input:0, line 32 <- wrt source file 2025-10-10T01:45:01.2364110Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv1d_input:0 2025-10-10T01:45:01.2365318Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv1d_weight:0, line 79 <- wrt source file 2025-10-10T01:45:01.2369092Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv1d_weight:0 2025-10-10T01:45:01.2370291Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv2d_input:0, line 130 <- wrt source file 2025-10-10T01:45:01.2376995Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv2d_input:0 2025-10-10T01:45:01.2378162Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv2d_weight:0, line 177 <- wrt source file 2025-10-10T01:45:01.2382588Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv2d_weight:0 2025-10-10T01:45:01.2383764Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv3d_input:0, line 228 <- wrt source file 2025-10-10T01:45:01.2422563Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv3d_input:0 2025-10-10T01:45:01.2423874Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv3d_weight:0, line 275 <- wrt source file 2025-10-10T01:45:01.2445320Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv3d_weight:0 2025-10-10T01:45:01.2446822Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::calculate_gain:0, line 172 <- wrt source file 2025-10-10T01:45:01.2448822Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::calculate_gain:0 2025-10-10T01:45:01.2450268Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::uniform_:0, line 231 <- wrt source file 2025-10-10T01:45:01.2452395Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::uniform_:0 2025-10-10T01:45:01.2453801Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::normal_:0, line 258 <- wrt source file 2025-10-10T01:45:01.2455941Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::normal_:0 2025-10-10T01:45:01.2457383Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::trunc_normal_:0, line 293 <- wrt source file 2025-10-10T01:45:01.2460344Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::trunc_normal_:0 2025-10-10T01:45:01.2461798Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::constant_:0, line 307 <- wrt source file 2025-10-10T01:45:01.2463897Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::constant_:0 2025-10-10T01:45:01.2464941Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::ones_:0, line 324 <- wrt source file 2025-10-10T01:45:01.2467210Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::ones_:0 2025-10-10T01:45:01.2468236Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::zeros_:0, line 337 <- wrt source file 2025-10-10T01:45:01.2470545Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::zeros_:0 2025-10-10T01:45:01.2471610Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::eye_:0, line 353 <- wrt source file 2025-10-10T01:45:01.2474019Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::eye_:0 2025-10-10T01:45:01.2475111Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::dirac_:0, line 375 <- wrt source file 2025-10-10T01:45:01.2479527Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::dirac_:0 2025-10-10T01:45:01.2480581Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::xavier_uniform_:0, 
line 461 <- wrt source file 2025-10-10T01:45:01.2483363Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::xavier_uniform_:0 2025-10-10T01:45:01.2484481Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::xavier_normal_:0, line 493 <- wrt source file 2025-10-10T01:45:01.2486851Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::xavier_normal_:0 2025-10-10T01:45:01.2487965Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::kaiming_uniform_:0, line 545 <- wrt source file 2025-10-10T01:45:01.2490821Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::kaiming_uniform_:0 2025-10-10T01:45:01.2492064Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::kaiming_normal_:0, line 610 <- wrt source file 2025-10-10T01:45:01.2494271Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::kaiming_normal_:0 2025-10-10T01:45:01.2495358Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::orthogonal_:0, line 649 <- wrt source file 2025-10-10T01:45:01.2496624Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::orthogonal_:0 2025-10-10T01:45:01.2497662Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::sparse_:0, line 702 <- wrt source file 2025-10-10T01:45:01.2499916Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::sparse_:0 2025-10-10T01:45:01.2501065Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/__init__.py::sdpa_kernel:0, line 120 <- wrt source file 2025-10-10T01:45:01.2502246Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/__init__.py::sdpa_kernel:0 2025-10-10T01:45:01.2503233Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential:0, line 81 <- wrt source file 2025-10-10T01:45:01.2504239Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential:0 2025-10-10T01:45:01.2505236Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.append:0, line 261 <- wrt source file 2025-10-10T01:45:01.2513175Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.append:0 2025-10-10T01:45:01.2514793Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.insert:0, line 284 <- wrt source file 2025-10-10T01:45:01.2520878Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.insert:0 2025-10-10T01:45:01.2522291Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.extend:0, line 315 <- wrt source file 2025-10-10T01:45:01.2530063Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.extend:0 2025-10-10T01:45:01.2531426Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ModuleList:0, line 344 <- wrt source file 2025-10-10T01:45:01.2532760Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ModuleList:0 2025-10-10T01:45:01.2534088Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ModuleDict:0, line 525 <- wrt source file 2025-10-10T01:45:01.2535445Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ModuleDict:0 2025-10-10T01:45:01.2536804Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ParameterList:0, line 657 <- wrt source file 
2025-10-10T01:45:01.2538192Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ParameterList:0 2025-10-10T01:45:01.2539536Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ParameterDict:0, line 815 <- wrt source file 2025-10-10T01:45:01.2540904Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ParameterDict:0 2025-10-10T01:45:01.2542473Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/flatten.py::Flatten:0, line 30 <- wrt source file 2025-10-10T01:45:01.2543515Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/flatten.py::Flatten:0 2025-10-10T01:45:01.2544469Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/flatten.py::Unflatten:0, line 87 <- wrt source file 2025-10-10T01:45:01.2558579Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/flatten.py::Unflatten:0 2025-10-10T01:45:01.2560266Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::Transformer:0, line 91 <- wrt source file 2025-10-10T01:45:02.2337369Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::Transformer:0 2025-10-10T01:45:02.2351934Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::Transformer.forward:0, line 267 <- wrt source file 2025-10-10T01:45:02.2353849Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::Transformer.forward:0 2025-10-10T01:45:02.2355684Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerEncoder:0, line 345 <- wrt source file 2025-10-10T01:45:02.3694221Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerEncoder:0 
2025-10-10T01:45:02.3698612Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerDecoder:0, line 576 <- wrt source file 2025-10-10T01:45:02.6496836Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerDecoder:0 2025-10-10T01:45:02.6504377Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerEncoderLayer:0, line 700 <- wrt source file 2025-10-10T01:45:02.6907173Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerEncoderLayer:0 2025-10-10T01:45:02.7186529Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerDecoderLayer:0, line 1011 <- wrt source file 2025-10-10T01:45:02.8322469Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerDecoderLayer:0 2025-10-10T01:45:02.8324023Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm1d:0, line 341 <- wrt source file 2025-10-10T01:45:02.8335281Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm1d:0 2025-10-10T01:45:02.8336351Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm2d:0, line 452 <- wrt source file 2025-10-10T01:45:02.8595831Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm2d:0 2025-10-10T01:45:02.8598181Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm3d:0, line 563 <- wrt source file 2025-10-10T01:45:03.0382200Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm3d:0 2025-10-10T01:45:03.0542678Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::SyncBatchNorm:0, line 687 <- wrt source file 2025-10-10T01:45:03.0545725Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::SyncBatchNorm:0 2025-10-10T01:45:03.0548543Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::SyncBatchNorm.convert_sync_batchnorm:0, line 854 <- wrt source file 2025-10-10T01:45:03.0550897Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::SyncBatchNorm.convert_sync_batchnorm:0 2025-10-10T01:45:03.0553020Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Threshold:0, line 72 <- wrt source file 2025-10-10T01:45:03.0555378Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Threshold:0 2025-10-10T01:45:03.0557135Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ReLU:0, line 120 <- wrt source file 2025-10-10T01:45:03.0559719Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ReLU:0 2025-10-10T01:45:03.0561409Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::RReLU:0, line 185 <- wrt source file 2025-10-10T01:45:03.0564858Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::RReLU:0 2025-10-10T01:45:03.0566890Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardtanh:0, line 247 <- wrt source file 2025-10-10T01:45:03.0569614Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardtanh:0 2025-10-10T01:45:03.0571390Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ReLU6:0, line 318 <- wrt source file 
2025-10-10T01:45:03.0573494Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ReLU6:0 2025-10-10T01:45:03.0575197Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Sigmoid:0, line 349 <- wrt source file 2025-10-10T01:45:03.0577686Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Sigmoid:0 2025-10-10T01:45:03.0579478Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardsigmoid:0, line 384 <- wrt source file 2025-10-10T01:45:03.0581714Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardsigmoid:0 2025-10-10T01:45:03.0583441Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Tanh:0, line 420 <- wrt source file 2025-10-10T01:45:03.0585610Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Tanh:0 2025-10-10T01:45:03.0587281Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::SiLU:0, line 456 <- wrt source file 2025-10-10T01:45:03.0589711Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::SiLU:0 2025-10-10T01:45:03.0591357Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Mish:0, line 501 <- wrt source file 2025-10-10T01:45:03.0593683Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Mish:0 2025-10-10T01:45:03.0595720Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardswish:0, line 552 <- wrt source file 2025-10-10T01:45:03.0598223Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardswish:0 2025-10-10T01:45:03.0600293Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ELU:0, line 598 <- wrt source file 2025-10-10T01:45:03.0602403Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ELU:0 2025-10-10T01:45:03.0604083Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::CELU:0, line 646 <- wrt source file 2025-10-10T01:45:03.0606686Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::CELU:0 2025-10-10T01:45:03.0610573Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::SELU:0, line 705 <- wrt source file 2025-10-10T01:45:03.0611791Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::SELU:0 2025-10-10T01:45:03.0612985Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::GLU:0, line 751 <- wrt source file 2025-10-10T01:45:03.0614589Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::GLU:0 2025-10-10T01:45:03.0615768Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::GELU:0, line 799 <- wrt source file 2025-10-10T01:45:03.0625248Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::GELU:0 2025-10-10T01:45:03.0627594Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardshrink:0, line 848 <- wrt source file 2025-10-10T01:45:03.0629789Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardshrink:0 2025-10-10T01:45:03.0631042Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LeakyReLU:0, line 903 <- wrt source file 2025-10-10T01:45:03.0633244Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LeakyReLU:0 2025-10-10T01:45:03.0634753Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LogSigmoid:0, line 945 <- wrt source file 2025-10-10T01:45:03.0638409Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LogSigmoid:0 2025-10-10T01:45:03.0639629Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softplus:0, line 981 <- wrt source file 2025-10-10T01:45:03.0642420Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softplus:0 2025-10-10T01:45:03.0643648Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softshrink:0, line 1030 <- wrt source file 2025-10-10T01:45:03.0646564Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softshrink:0 2025-10-10T01:45:03.0647845Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::MultiheadAttention:0, line 1148 <- wrt source file 2025-10-10T01:45:03.0649194Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::MultiheadAttention:0 2025-10-10T01:45:03.0650437Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::PReLU:0, line 1613 <- wrt source file 2025-10-10T01:45:03.0652219Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::PReLU:0 2025-10-10T01:45:03.0653635Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softsign:0, line 1664 <- wrt source file 2025-10-10T01:45:03.0656883Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softsign:0 2025-10-10T01:45:03.0658092Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Tanhshrink:0, line 1690 <- wrt source file 2025-10-10T01:45:03.0660930Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Tanhshrink:0 2025-10-10T01:45:03.0662376Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmin:0, line 1728 <- wrt source file 2025-10-10T01:45:03.0665855Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmin:0 2025-10-10T01:45:03.0667053Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmax:0, line 1792 <- wrt source file 2025-10-10T01:45:03.0670551Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmax:0 2025-10-10T01:45:03.0671764Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmax2d:0, line 1839 <- wrt source file 2025-10-10T01:45:03.0675418Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmax2d:0 2025-10-10T01:45:03.0676660Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LogSoftmax:0, line 1878 <- wrt source file 2025-10-10T01:45:03.0680140Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LogSoftmax:0 2025-10-10T01:45:03.0681356Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad1d:0, line 70 <- wrt source file 2025-10-10T01:45:03.0687462Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad1d:0 2025-10-10T01:45:03.0688689Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad2d:0, line 123 <- wrt source file 2025-10-10T01:45:03.0712822Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad2d:0 2025-10-10T01:45:03.0714052Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad3d:0, line 189 <- wrt source file 2025-10-10T01:45:03.6291503Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad3d:0 2025-10-10T01:45:03.6915396Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad1d:0, line 244 <- wrt source file 2025-10-10T01:45:03.6927041Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad1d:0 2025-10-10T01:45:03.6929146Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad2d:0, line 298 <- wrt source file 2025-10-10T01:45:03.6934693Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad2d:0 2025-10-10T01:45:03.6936824Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad3d:0, line 355 <- wrt source file 2025-10-10T01:45:03.7397076Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad3d:0 2025-10-10T01:45:03.7398838Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad1d:0, line 401 <- wrt source file 2025-10-10T01:45:03.7407216Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad1d:0 2025-10-10T01:45:03.7409128Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad2d:0, line 446 <- wrt source file 2025-10-10T01:45:03.7418017Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad2d:0 2025-10-10T01:45:03.7420168Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad3d:0, line 505 <- wrt source file 2025-10-10T01:45:03.7424292Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad3d:0 2025-10-10T01:45:03.7425818Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad1d:0, line 565 <- wrt source file 2025-10-10T01:45:03.7432127Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad1d:0 2025-10-10T01:45:03.7433715Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad2d:0, line 610 <- wrt source file 2025-10-10T01:45:03.7439793Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad2d:0 2025-10-10T01:45:03.7441363Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad3d:0, line 669 <- wrt source file 2025-10-10T01:45:04.1957864Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad3d:0 2025-10-10T01:45:04.2582260Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad1d:0, line 704 <- wrt source file 2025-10-10T01:45:04.2594579Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad1d:0 2025-10-10T01:45:04.2596631Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad2d:0, line 762 <- wrt source file 2025-10-10T01:45:04.2601918Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad2d:0 2025-10-10T01:45:04.2603985Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad3d:0, line 824 <- wrt source file 2025-10-10T01:45:04.2626441Z * 
SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad3d:0 2025-10-10T01:45:04.2628533Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pixelshuffle.py::PixelShuffle:0, line 40 <- wrt source file 2025-10-10T01:45:04.2635761Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pixelshuffle.py::PixelShuffle:0 2025-10-10T01:45:04.2637716Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pixelshuffle.py::PixelUnshuffle:0, line 99 <- wrt source file 2025-10-10T01:45:04.2652466Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pixelshuffle.py::PixelUnshuffle:0 2025-10-10T01:45:04.2654433Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/distance.py::PairwiseDistance:0, line 38 <- wrt source file 2025-10-10T01:45:04.2659272Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/distance.py::PairwiseDistance:0 2025-10-10T01:45:04.2661133Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/distance.py::CosineSimilarity:0, line 81 <- wrt source file 2025-10-10T01:45:04.2667233Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/distance.py::CosineSimilarity:0 2025-10-10T01:45:04.2669180Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/channelshuffle.py::ChannelShuffle:0, line 21 <- wrt source file 2025-10-10T01:45:04.2687453Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/channelshuffle.py::ChannelShuffle:0 2025-10-10T01:45:04.2689713Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::Embedding:0, line 71 <- wrt source file 2025-10-10T01:45:04.2701956Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::Embedding:0 
2025-10-10T01:45:04.2703877Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::Embedding.from_pretrained:0, line 243 <- wrt source file 2025-10-10T01:45:04.2708430Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::Embedding.from_pretrained:0 2025-10-10T01:45:04.2710270Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::EmbeddingBag:0, line 322 <- wrt source file 2025-10-10T01:45:04.2725931Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::EmbeddingBag:0 2025-10-10T01:45:04.2727911Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::EmbeddingBag.from_pretrained:0, line 521 <- wrt source file 2025-10-10T01:45:04.2732978Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::EmbeddingBag.from_pretrained:0 2025-10-10T01:45:04.2734878Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/lazy.py::LazyModuleMixin:0, line 77 <- wrt source file 2025-10-10T01:45:04.2737562Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/lazy.py::LazyModuleMixin:0 2025-10-10T01:45:04.2739224Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/fold.py::Fold:0, line 224 <- wrt source file 2025-10-10T01:45:04.2742719Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/fold.py::Fold:0 2025-10-10T01:45:04.2744329Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/fold.py::Unfold:0, line 395 <- wrt source file 2025-10-10T01:45:04.2761640Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/fold.py::Unfold:0 2025-10-10T01:45:04.2763967Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::LocalResponseNorm:0, line 38 <- wrt 
source file 2025-10-10T01:45:04.3024930Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::LocalResponseNorm:0 2025-10-10T01:45:04.3026956Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::LayerNorm:0, line 163 <- wrt source file 2025-10-10T01:45:04.3158021Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::LayerNorm:0 2025-10-10T01:45:04.3159979Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::GroupNorm:0, line 274 <- wrt source file 2025-10-10T01:45:04.3176731Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::GroupNorm:0 2025-10-10T01:45:04.3178734Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::RMSNorm:0, line 367 <- wrt source file 2025-10-10T01:45:04.3188063Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::RMSNorm:0 2025-10-10T01:45:04.3190987Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Identity:0, line 34 <- wrt source file 2025-10-10T01:45:04.3198096Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Identity:0 2025-10-10T01:45:04.3199075Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Linear:0, line 83 <- wrt source file 2025-10-10T01:45:04.3208019Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Linear:0 2025-10-10T01:45:04.3209061Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Bilinear:0, line 191 <- wrt source file 2025-10-10T01:45:04.3259050Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Bilinear:0 2025-10-10T01:45:04.3260320Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.register_buffer:0, line 554 <- wrt source file 2025-10-10T01:45:04.3261652Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.register_buffer:0 2025-10-10T01:45:04.3262900Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.apply:0, line 1048 <- wrt source file 2025-10-10T01:45:04.3283160Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.apply:0 2025-10-10T01:45:04.3284347Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.to:0, line 1299 <- wrt source file 2025-10-10T01:45:04.3294436Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.to:0 2025-10-10T01:45:04.3295657Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.state_dict:0, line 2238 <- wrt source file 2025-10-10T01:45:04.3296939Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.state_dict:0 2025-10-10T01:45:04.3298183Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.parameters:0, line 2682 <- wrt source file 2025-10-10T01:45:04.3299480Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.parameters:0 2025-10-10T01:45:04.3300862Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_parameters:0, line 2710 <- wrt source file 2025-10-10T01:45:04.3302314Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_parameters:0 2025-10-10T01:45:04.3303682Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.buffers:0, line 2737 <- wrt 
source file 2025-10-10T01:45:04.3305025Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.buffers:0 2025-10-10T01:45:04.3306377Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_buffers:0, line 2764 <- wrt source file 2025-10-10T01:45:04.3307793Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_buffers:0 2025-10-10T01:45:04.3309175Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_children:0, line 2795 <- wrt source file 2025-10-10T01:45:04.3310672Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_children:0 2025-10-10T01:45:04.3311732Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.modules:0, line 2819 <- wrt source file 2025-10-10T01:45:04.3312766Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.modules:0 2025-10-10T01:45:04.3313800Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_modules:0, line 2857 <- wrt source file 2025-10-10T01:45:04.3316465Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_modules:0 2025-10-10T01:45:04.3317475Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout:0, line 60 <- wrt source file 2025-10-10T01:45:04.3323437Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout:0 2025-10-10T01:45:04.3324778Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout1d:0, line 108 <- wrt source file 2025-10-10T01:45:04.3330135Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout1d:0 2025-10-10T01:45:04.3331442Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout2d:0, line 163 <- wrt source file 2025-10-10T01:45:04.3365339Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout2d:0 2025-10-10T01:45:04.3366535Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout3d:0, line 211 <- wrt source file 2025-10-10T01:45:04.3436619Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout3d:0 2025-10-10T01:45:04.3438060Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::AlphaDropout:0, line 257 <- wrt source file 2025-10-10T01:45:04.3441792Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::AlphaDropout:0 2025-10-10T01:45:04.3443306Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::FeatureAlphaDropout:0, line 309 <- wrt source file 2025-10-10T01:45:04.3576139Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::FeatureAlphaDropout:0 2025-10-10T01:45:04.3577973Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm1d:0, line 187 <- wrt source file 2025-10-10T01:45:04.3588815Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm1d:0 2025-10-10T01:45:04.3590395Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm2d:0, line 303 <- wrt source file 2025-10-10T01:45:04.3746102Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm2d:0 2025-10-10T01:45:04.3747873Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm3d:0, line 419 <- wrt source file 2025-10-10T01:45:04.5242828Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm3d:0 2025-10-10T01:45:04.5410363Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool1d:0, line 129 <- wrt source file 2025-10-10T01:45:04.5425879Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool1d:0 2025-10-10T01:45:04.5427440Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool2d:0, line 207 <- wrt source file 2025-10-10T01:45:04.5470328Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool2d:0 2025-10-10T01:45:04.5471797Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool3d:0, line 291 <- wrt source file 2025-10-10T01:45:04.6541469Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool3d:0 2025-10-10T01:45:04.6543605Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool1d:0, line 366 <- wrt source file 2025-10-10T01:45:04.6555041Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool1d:0 2025-10-10T01:45:04.6556369Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool2d:0, line 452 <- wrt source file 2025-10-10T01:45:04.6578322Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool2d:0 2025-10-10T01:45:04.6579576Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool3d:0, line 550 <- wrt source file 2025-10-10T01:45:04.6985030Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool3d:0 2025-10-10T01:45:04.6987216Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool1d:0, line 642 <- wrt source file 2025-10-10T01:45:04.6995568Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool1d:0 2025-10-10T01:45:04.6997707Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool2d:0, line 738 <- wrt source file 2025-10-10T01:45:04.7054354Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool2d:0 2025-10-10T01:45:04.7056158Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool3d:0, line 855 <- wrt source file 2025-10-10T01:45:04.9147926Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool3d:0 2025-10-10T01:45:04.9150310Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::FractionalMaxPool2d:0, line 946 <- wrt source file 2025-10-10T01:45:04.9211332Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::FractionalMaxPool2d:0 2025-10-10T01:45:04.9213291Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::FractionalMaxPool3d:0, line 1033 <- wrt source file 2025-10-10T01:45:04.9711841Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::FractionalMaxPool3d:0 2025-10-10T01:45:04.9713155Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool1d:0, line 1156 <- wrt source file 2025-10-10T01:45:04.9721529Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool1d:0 2025-10-10T01:45:04.9722741Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool2d:0, line 1212 <- wrt source file 2025-10-10T01:45:04.9764472Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool2d:0 2025-10-10T01:45:04.9765703Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool3d:0, line 1276 <- wrt source file 2025-10-10T01:45:05.1894003Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool3d:0 2025-10-10T01:45:05.1895400Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool1d:0, line 1332 <- wrt source file 2025-10-10T01:45:05.1904892Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool1d:0 2025-10-10T01:45:05.1906940Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool2d:0, line 1367 <- wrt source file 2025-10-10T01:45:05.1920536Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool2d:0 2025-10-10T01:45:05.1922484Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool3d:0, line 1411 <- wrt source file 2025-10-10T01:45:05.1940336Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool3d:0 2025-10-10T01:45:05.1941810Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool1d:0, line 1459 <- wrt source file 2025-10-10T01:45:05.1947093Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool1d:0 2025-10-10T01:45:05.1948521Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool2d:0, line 1493 <- wrt source file 
2025-10-10T01:45:05.1957448Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool2d:0 2025-10-10T01:45:05.1958841Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool3d:0, line 1533 <- wrt source file 2025-10-10T01:45:05.1980219Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool3d:0 2025-10-10T01:45:05.1981544Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::L1Loss:0, line 116 <- wrt source file 2025-10-10T01:45:05.1988933Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::L1Loss:0 2025-10-10T01:45:05.1990159Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::NLLLoss:0, line 216 <- wrt source file 2025-10-10T01:45:05.2023486Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::NLLLoss:0 2025-10-10T01:45:05.2025112Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::PoissonNLLLoss:0, line 330 <- wrt source file 2025-10-10T01:45:05.2031530Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::PoissonNLLLoss:0 2025-10-10T01:45:05.2032873Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::GaussianNLLLoss:0, line 419 <- wrt source file 2025-10-10T01:45:05.2048369Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::GaussianNLLLoss:0 2025-10-10T01:45:05.2049663Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::KLDivLoss:0, line 536 <- wrt source file 2025-10-10T01:45:05.2059179Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::KLDivLoss:0 2025-10-10T01:45:05.2061009Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MSELoss:0, line 618 <- wrt source file 2025-10-10T01:45:05.2066676Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MSELoss:0 2025-10-10T01:45:05.2068018Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCELoss:0, line 704 <- wrt source file 2025-10-10T01:45:05.2075036Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCELoss:0 2025-10-10T01:45:05.2077229Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCEWithLogitsLoss:0, line 779 <- wrt source file 2025-10-10T01:45:05.2088740Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCEWithLogitsLoss:0 2025-10-10T01:45:05.2090620Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCEWithLogitsLoss:1, line 827 <- wrt source file 2025-10-10T01:45:05.2096571Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCEWithLogitsLoss:1 2025-10-10T01:45:05.2098416Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MultiLabelMarginLoss:0, line 975 <- wrt source file 2025-10-10T01:45:05.2106634Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MultiLabelMarginLoss:0 2025-10-10T01:45:05.2108492Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CrossEntropyLoss:0, line 1307 <- wrt source file 2025-10-10T01:45:05.2118598Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CrossEntropyLoss:0 2025-10-10T01:45:05.2119857Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CrossEntropyLoss:1, line 1334 <- wrt source file 2025-10-10T01:45:05.2122116Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CrossEntropyLoss:1 2025-10-10T01:45:05.2123358Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CosineEmbeddingLoss:0, line 1496 <- wrt source file 2025-10-10T01:45:05.2132270Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CosineEmbeddingLoss:0 2025-10-10T01:45:05.2133530Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MarginRankingLoss:0, line 1563 <- wrt source file 2025-10-10T01:45:05.2140829Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MarginRankingLoss:0 2025-10-10T01:45:05.2142324Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MultiMarginLoss:0, line 1644 <- wrt source file 2025-10-10T01:45:05.2149609Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MultiMarginLoss:0 2025-10-10T01:45:05.2150963Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::TripletMarginLoss:0, line 1746 <- wrt source file 2025-10-10T01:45:05.2164592Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::TripletMarginLoss:0 2025-10-10T01:45:05.2166157Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::TripletMarginWithDistanceLoss:0, line 1859 <- wrt source file 2025-10-10T01:45:05.2188080Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::TripletMarginWithDistanceLoss:0 2025-10-10T01:45:05.2190090Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CTCLoss:0, line 1991 <- wrt source file 2025-10-10T01:45:05.2216356Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CTCLoss:0 2025-10-10T01:45:05.2217811Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::RNN:0, line 597 <- wrt source file 2025-10-10T01:45:05.2232470Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::RNN:0 2025-10-10T01:45:05.2234219Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::LSTM:0, line 966 <- wrt source file 2025-10-10T01:45:05.2421129Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::LSTM:0 2025-10-10T01:45:05.2422545Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::GRU:0, line 1313 <- wrt source file 2025-10-10T01:45:05.2440831Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::GRU:0 2025-10-10T01:45:05.2442191Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::RNNCell:0, line 1573 <- wrt source file 2025-10-10T01:45:05.2455168Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::RNNCell:0 2025-10-10T01:45:05.2456604Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::LSTMCell:0, line 1695 <- wrt source file 2025-10-10T01:45:05.2467481Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::LSTMCell:0 2025-10-10T01:45:05.2468900Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::GRUCell:0, line 1809 <- wrt source file 2025-10-10T01:45:05.2484364Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::GRUCell:0 2025-10-10T01:45:05.2485854Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::Upsample:0, line 77 <- wrt source file 2025-10-10T01:45:05.2510452Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::Upsample:0 2025-10-10T01:45:05.2512072Z * DOCTEST 
: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::UpsamplingNearest2d:0, line 229 <- wrt source file 2025-10-10T01:45:05.2524381Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::UpsamplingNearest2d:0 2025-10-10T01:45:05.2526076Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::UpsamplingBilinear2d:0, line 279 <- wrt source file 2025-10-10T01:45:05.2535266Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::UpsamplingBilinear2d:0 2025-10-10T01:45:05.2536789Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/bias.py::CausalBias:0, line 95 <- wrt source file 2025-10-10T01:45:05.2538216Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/bias.py::CausalBias:0 2025-10-10T01:45:05.2539824Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/memory_format.py::convert_conv2d_weight_memory_format:0, line 64 <- wrt source file 2025-10-10T01:45:05.2541627Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/memory_format.py::convert_conv2d_weight_memory_format:0 2025-10-10T01:45:05.2543580Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/memory_format.py::convert_conv3d_weight_memory_format:0, line 143 <- wrt source file 2025-10-10T01:45:05.2545370Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/memory_format.py::convert_conv3d_weight_memory_format:0 2025-10-10T01:45:05.2546970Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/stateless.py::functional_call:0, line 196 <- wrt source file 2025-10-10T01:45:05.2548275Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/stateless.py::functional_call:0 2025-10-10T01:45:05.2549649Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::identity:0, line 852 <- wrt source file 2025-10-10T01:45:05.2550611Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::identity:0 2025-10-10T01:45:05.2551623Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::random_unstructured:0, line 888 <- wrt source file 2025-10-10T01:45:05.2552675Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::random_unstructured:0 2025-10-10T01:45:05.2553686Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::l1_unstructured:0, line 931 <- wrt source file 2025-10-10T01:45:05.2554774Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::l1_unstructured:0 2025-10-10T01:45:05.2555787Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::random_structured:0, line 971 <- wrt source file 2025-10-10T01:45:05.2556831Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::random_structured:0 2025-10-10T01:45:05.2557830Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::ln_structured:0, line 1017 <- wrt source file 2025-10-10T01:45:05.2563184Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::ln_structured:0 2025-10-10T01:45:05.2564194Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::global_unstructured:0, line 1070 <- wrt source file 2025-10-10T01:45:05.2580863Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::global_unstructured:0 2025-10-10T01:45:05.2581860Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::custom_from_mask:0, line 1173 <- wrt source file 2025-10-10T01:45:05.2590734Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::custom_from_mask:0 2025-10-10T01:45:05.2591682Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::remove:0, line 1201 <- wrt source file 2025-10-10T01:45:05.2597889Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::remove:0 2025-10-10T01:45:05.2598815Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::is_pruned:0, line 1229 <- wrt source file 2025-10-10T01:45:05.2606802Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::is_pruned:0 2025-10-10T01:45:05.2607783Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/spectral_norm.py::spectral_norm:0, line 314 <- wrt source file 2025-10-10T01:45:05.2614293Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/spectral_norm.py::spectral_norm:0 2025-10-10T01:45:05.2615344Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/spectral_norm.py::remove_spectral_norm:0, line 347 <- wrt source file 2025-10-10T01:45:05.2621400Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/spectral_norm.py::remove_spectral_norm:0 2025-10-10T01:45:05.2622519Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::orthogonal:0, line 266 <- wrt source file 2025-10-10T01:45:05.2623584Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::orthogonal:0 2025-10-10T01:45:05.2624756Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::weight_norm:0, line 361 <- wrt source file 2025-10-10T01:45:05.2632216Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::weight_norm:0 2025-10-10T01:45:05.2633305Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::spectral_norm:0, line 592 <- wrt source file 2025-10-10T01:45:05.2634487Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::spectral_norm:0 2025-10-10T01:45:05.2635456Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/init.py::skip_init:0, line 33 <- wrt source file 2025-10-10T01:45:05.2648358Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/init.py::skip_init:0 2025-10-10T01:45:05.2649320Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pad_packed_sequence:0, line 359 <- wrt source file 2025-10-10T01:45:05.2665741Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pad_packed_sequence:0 2025-10-10T01:45:05.2666679Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pad_sequence:0, line 439 <- wrt source file 2025-10-10T01:45:05.2673165Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pad_sequence:0 2025-10-10T01:45:05.2674908Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::unpad_sequence:0, line 500 <- wrt source file 2025-10-10T01:45:05.2947173Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::unpad_sequence:0 2025-10-10T01:45:05.2948937Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pack_sequence:0, line 556 <- wrt source file 2025-10-10T01:45:05.2955989Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pack_sequence:0 2025-10-10T01:45:05.2957729Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::unpack_sequence:0, line 584 <- wrt source file 2025-10-10T01:45:05.2974753Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::unpack_sequence:0 2025-10-10T01:45:05.2985249Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_per_sample_grad.py::call_for_per_sample_grads:0, line 35 <- wrt source file 2025-10-10T01:45:05.2987070Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_per_sample_grad.py::call_for_per_sample_grads:0 2025-10-10T01:45:05.2988615Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py::weight_norm:0, line 134 <- wrt source file 2025-10-10T01:45:05.2993721Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py::weight_norm:0 2025-10-10T01:45:05.2995123Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py::remove_weight_norm:0, line 156 <- wrt source file 2025-10-10T01:45:05.3001194Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py::remove_weight_norm:0 2025-10-10T01:45:05.3002546Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_expanded_weights/conv_utils.py::unfold3d:0, line 316 <- wrt source file 2025-10-10T01:45:05.3003923Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_expanded_weights/conv_utils.py::unfold3d:0 2025-10-10T01:45:05.3005646Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_expanded_weights/expanded_weights_utils.py::sum_over_all_but_batch_and_last_n:0, line 179 <- wrt source file 2025-10-10T01:45:05.3008739Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_expanded_weights/expanded_weights_utils.py::sum_over_all_but_batch_and_last_n:0 2025-10-10T01:45:05.3010285Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py::DataParallel:0, line 127 <- wrt source file 
2025-10-10T01:45:05.3011602Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py::DataParallel:0 2025-10-10T01:45:05.3012968Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel:0, line 644 <- wrt source file 2025-10-10T01:45:05.3014423Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel:0 2025-10-10T01:45:05.3015899Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.no_sync:0, line 1452 <- wrt source file 2025-10-10T01:45:05.3017411Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.no_sync:0 2025-10-10T01:45:05.3018845Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.join:0, line 1839 <- wrt source file 2025-10-10T01:45:05.3020042Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.join:0 2025-10-10T01:45:05.3021274Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.register_comm_hook:0, line 2005 <- wrt source file 2025-10-10T01:45:05.3022578Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.register_comm_hook:0 2025-10-10T01:45:05.3023853Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.register_comm_hook:1, line 2015 <- wrt source file 2025-10-10T01:45:05.3025145Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.register_comm_hook:1 2025-10-10T01:45:05.3026442Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel._register_builtin_comm_hook:0, line 2050 <- wrt source file 2025-10-10T01:45:05.3027778Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel._register_builtin_comm_hook:0 2025-10-10T01:45:05.3029093Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel._register_fused_optim:0, line 2108 <- wrt source file 2025-10-10T01:45:05.3030394Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel._register_fused_optim:0 2025-10-10T01:45:05.3031616Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/run.py::__doc__:0, line 57 <- wrt source file 2025-10-10T01:45:05.3032546Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/run.py::__doc__:0 2025-10-10T01:45:05.3033453Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/launch.py::__doc__:0, line 84 <- wrt source file 2025-10-10T01:45:05.3034466Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/launch.py::__doc__:0 2025-10-10T01:45:05.3046457Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::_coalescing_manager:0, line 2603 <- wrt source file 2025-10-10T01:45:05.3047610Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::_coalescing_manager:0 2025-10-10T01:45:05.3048717Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::_time_estimator:0, line 2705 <- wrt source file 2025-10-10T01:45:05.3049837Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::_time_estimator:0 
2025-10-10T01:45:05.3050938Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::batch_isend_irecv:0, line 2752 <- wrt source file 2025-10-10T01:45:05.3052069Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::batch_isend_irecv:0 2025-10-10T01:45:05.3053124Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_reduce:0, line 2889 <- wrt source file 2025-10-10T01:45:05.3054193Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_reduce:0 2025-10-10T01:45:05.3055255Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_object:0, line 3172 <- wrt source file 2025-10-10T01:45:05.3056359Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_object:0 2025-10-10T01:45:05.3057425Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::gather_object:0, line 3276 <- wrt source file 2025-10-10T01:45:05.3058507Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::gather_object:0 2025-10-10T01:45:05.3059570Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::send_object_list:0, line 3407 <- wrt source file 2025-10-10T01:45:05.3060682Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::send_object_list:0 2025-10-10T01:45:05.3061763Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::recv_object_list:0, line 3524 <- wrt source file 2025-10-10T01:45:05.3062857Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::recv_object_list:0 
2025-10-10T01:45:05.3063953Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::broadcast_object_list:0, line 3670 <- wrt source file 2025-10-10T01:45:05.3065100Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::broadcast_object_list:0 2025-10-10T01:45:05.3066395Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::scatter_object_list:0, line 3795 <- wrt source file 2025-10-10T01:45:05.3067567Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::scatter_object_list:0 2025-10-10T01:45:05.3068655Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather:0, line 3898 <- wrt source file 2025-10-10T01:45:05.3069731Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather:0 2025-10-10T01:45:05.3070963Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_into_tensor:0, line 4005 <- wrt source file 2025-10-10T01:45:05.3072219Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_into_tensor:0 2025-10-10T01:45:05.3073345Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_coalesced:0, line 4143 <- wrt source file 2025-10-10T01:45:05.3074694Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_coalesced:0 2025-10-10T01:45:05.3075771Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::gather:0, line 4249 <- wrt source file 2025-10-10T01:45:05.3076843Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::gather:0 2025-10-10T01:45:05.3077871Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::scatter:0, line 4334 <- wrt source file 2025-10-10T01:45:05.3078897Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::scatter:0 2025-10-10T01:45:05.3079966Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::reduce_scatter_tensor:0, line 4472 <- wrt source file 2025-10-10T01:45:05.3081105Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::reduce_scatter_tensor:0 2025-10-10T01:45:05.3082195Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_to_all_single:0, line 4614 <- wrt source file 2025-10-10T01:45:05.3083294Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_to_all_single:0 2025-10-10T01:45:05.3084346Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_to_all:0, line 4748 <- wrt source file 2025-10-10T01:45:05.3085398Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_to_all:0 2025-10-10T01:45:05.3086467Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::monitored_barrier:0, line 4959 <- wrt source file 2025-10-10T01:45:05.3087588Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::monitored_barrier:0 2025-10-10T01:45:05.3088673Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::new_subgroups:0, line 5506 <- wrt source file 2025-10-10T01:45:05.3089762Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::new_subgroups:0 2025-10-10T01:45:05.3090885Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::new_subgroups_by_enumeration:0, line 5600 <- wrt source file 2025-10-10T01:45:05.3092259Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::new_subgroups_by_enumeration:0 2025-10-10T01:45:05.3093348Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::DeviceMesh:0, line 414 <- wrt source file 2025-10-10T01:45:05.3094371Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::DeviceMesh:0 2025-10-10T01:45:05.3095440Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::DeviceMesh.get_local_rank:0, line 1035 <- wrt source file 2025-10-10T01:45:05.3096753Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::DeviceMesh.get_local_rank:0 2025-10-10T01:45:05.3097818Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::init_device_mesh:0, line 1181 <- wrt source file 2025-10-10T01:45:05.3098882Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::init_device_mesh:0 2025-10-10T01:45:05.3099952Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.composition:0, line 116 <- wrt source file 2025-10-10T01:45:05.3101060Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.composition:0 2025-10-10T01:45:05.3102162Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.complement:0, line 133 <- wrt source file 2025-10-10T01:45:05.3103276Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.complement:0 2025-10-10T01:45:05.3104392Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.remap_to_tensor:0, line 271 <- wrt source file 2025-10-10T01:45:05.3105533Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.remap_to_tensor:0 2025-10-10T01:45:05.3106606Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/autograd/__init__.py::context:0, line 47 <- wrt source file 2025-10-10T01:45:05.3107635Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/autograd/__init__.py::context:0 2025-10-10T01:45:05.3108697Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_tools/memory_tracker.py::MemoryTracker:0, line 55 <- wrt source file 2025-10-10T01:45:05.3109826Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_tools/memory_tracker.py::MemoryTracker:0 2025-10-10T01:45:05.3110874Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/join.py::Join:0, line 141 <- wrt source file 2025-10-10T01:45:05.3111890Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/join.py::Join:0 2025-10-10T01:45:05.3113002Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/__init__.py::register_ddp_comm_hook:0, line 137 <- wrt source file 2025-10-10T01:45:05.3114352Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/__init__.py::register_ddp_comm_hook:0 2025-10-10T01:45:05.3115748Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/model_averaging/hierarchical_model_averager.py::HierarchicalModelAverager:0, line 
54 <- wrt source file 2025-10-10T01:45:05.3117419Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/model_averaging/hierarchical_model_averager.py::HierarchicalModelAverager:0 2025-10-10T01:45:05.3118841Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/model_averaging/averagers.py::PeriodicModelAverager:0, line 57 <- wrt source file 2025-10-10T01:45:05.3120173Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/model_averaging/averagers.py::PeriodicModelAverager:0 2025-10-10T01:45:05.3121613Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py::powerSGD_hook:0, line 395 <- wrt source file 2025-10-10T01:45:05.3122884Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py::powerSGD_hook:0 2025-10-10T01:45:05.3124154Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py::batched_powerSGD_hook:0, line 708 <- wrt source file 2025-10-10T01:45:05.3125477Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py::batched_powerSGD_hook:0 2025-10-10T01:45:05.3126742Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::allreduce_hook:0, line 51 <- wrt source file 2025-10-10T01:45:05.3128058Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::allreduce_hook:0 2025-10-10T01:45:05.3129324Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::fp16_compress_hook:0, line 110 <- wrt source file 2025-10-10T01:45:05.3130631Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::fp16_compress_hook:0 2025-10-10T01:45:05.3131908Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::bf16_compress_hook:0, line 131 <- wrt source file 2025-10-10T01:45:05.3133191Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::bf16_compress_hook:0 2025-10-10T01:45:05.3134477Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::fp16_compress_wrapper:0, line 149 <- wrt source file 2025-10-10T01:45:05.3135798Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::fp16_compress_wrapper:0 2025-10-10T01:45:05.3137083Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::bf16_compress_wrapper:0, line 188 <- wrt source file 2025-10-10T01:45:05.3138389Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::bf16_compress_wrapper:0 2025-10-10T01:45:05.3139639Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/debugging_hooks.py::noop_hook:0, line 23 <- wrt source file 2025-10-10T01:45:05.3140881Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/debugging_hooks.py::noop_hook:0 2025-10-10T01:45:05.3142141Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/post_localSGD_hook.py::post_localSGD_hook:0, line 91 <- wrt source file 2025-10-10T01:45:05.3143624Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/post_localSGD_hook.py::post_localSGD_hook:0 2025-10-10T01:45:05.3144988Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py::quantization_pertensor_hook:0, line 64 <- wrt source file 2025-10-10T01:45:05.3146413Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py::quantization_pertensor_hook:0 2025-10-10T01:45:05.3147996Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py::quantization_perchannel_hook:0, line 146 <- wrt source file 2025-10-10T01:45:05.3149440Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py::quantization_perchannel_hook:0 2025-10-10T01:45:05.3150808Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/server_process_global_profiler.py::_server_process_global_profile:0, line 62 <- wrt source file 2025-10-10T01:45:05.3152153Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/server_process_global_profiler.py::_server_process_global_profile:0 2025-10-10T01:45:05.3153282Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::_wait_all:0, line 174 <- wrt source file 2025-10-10T01:45:05.3154314Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::_wait_all:0 2025-10-10T01:45:05.3155266Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::shutdown:0, line 345 <- wrt source file 2025-10-10T01:45:05.3156249Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::shutdown:0 2025-10-10T01:45:05.3157179Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::remote:0, line 607 <- wrt source file 2025-10-10T01:45:05.3158124Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::remote:0 2025-10-10T01:45:05.3159055Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::rpc_sync:0, line 788 <- wrt source file 2025-10-10T01:45:05.3160014Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::rpc_sync:0 2025-10-10T01:45:05.3160948Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::rpc_async:0, line 880 <- wrt source file 2025-10-10T01:45:05.3161911Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::rpc_async:0 2025-10-10T01:45:05.3162902Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/functions.py::async_execution:0, line 34 <- wrt source file 2025-10-10T01:45:05.3163995Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/functions.py::async_execution:0 2025-10-10T01:45:05.3165186Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/options.py::TensorPipeRpcBackendOptions.set_device_map:0, line 126 <- wrt source file 2025-10-10T01:45:05.3166495Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/options.py::TensorPipeRpcBackendOptions.set_device_map:0 2025-10-10T01:45:05.3167624Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/_IR.py::pipe_split:0, line 345 <- wrt source file 2025-10-10T01:45:05.3168830Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/_IR.py::pipe_split:0 2025-10-10T01:45:05.3169919Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::_CustomReducer:0, line 35 <- wrt source file 2025-10-10T01:45:05.3171085Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::_CustomReducer:0 2025-10-10T01:45:05.3172423Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::TensorChunkSpec.from_tuple:0, line 84 <- wrt source file 2025-10-10T01:45:05.3173671Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::TensorChunkSpec.from_tuple:0 2025-10-10T01:45:05.3174905Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::TensorChunkSpec.from_dict:0, line 103 <- wrt source file 2025-10-10T01:45:05.3176153Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::TensorChunkSpec.from_dict:0 2025-10-10T01:45:05.3177396Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_optim/__init__.py::named_params_with_sharded_tensor:0, line 31 <- wrt source file 2025-10-10T01:45:05.3178685Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_optim/__init__.py::named_params_with_sharded_tensor:0 2025-10-10T01:45:05.3179924Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py::init_from_local_shards:0, line 384 <- wrt source file 2025-10-10T01:45:05.3181157Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py::init_from_local_shards:0 2025-10-10T01:45:05.3182355Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py::custom_sharded_op_impl:0, line 457 <- wrt source file 
2025-10-10T01:45:05.3183580Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py::custom_sharded_op_impl:0 2025-10-10T01:45:05.3184737Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharding_plan/api.py::ShardingPlan:0, line 36 <- wrt source file 2025-10-10T01:45:05.3185866Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharding_plan/api.py::ShardingPlan:0 2025-10-10T01:45:05.3187080Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py::ShardedTensor._init_from_local_tensor:0, line 858 <- wrt source file 2025-10-10T01:45:05.3188413Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py::ShardedTensor._init_from_local_tensor:0 2025-10-10T01:45:05.3189654Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py::ShardedTensor.reshard:0, line 1096 <- wrt source file 2025-10-10T01:45:05.3190883Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py::ShardedTensor.reshard:0 2025-10-10T01:45:05.3192078Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/_ops/_common.py::_sharded_op_common:0, line 18 <- wrt source file 2025-10-10T01:45:05.3193309Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/_ops/_common.py::_sharded_op_common:0 2025-10-10T01:45:05.3194881Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/post_localSGD_optimizer.py::PostLocalSGDOptimizer:0, line 19 <- wrt source file 2025-10-10T01:45:05.3196175Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/post_localSGD_optimizer.py::PostLocalSGDOptimizer:0 2025-10-10T01:45:05.3197350Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/named_optimizer.py::_NamedOptimizer:0, line 43 <- wrt source file 2025-10-10T01:45:05.3198664Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/named_optimizer.py::_NamedOptimizer:0 2025-10-10T01:45:05.3199874Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/apply_optimizer_in_backward.py::_apply_optimizer_in_backward:0, line 43 <- wrt source file 2025-10-10T01:45:05.3201221Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/apply_optimizer_in_backward.py::_apply_optimizer_in_backward:0 2025-10-10T01:45:05.3202512Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/apply_optimizer_in_backward.py::_get_in_backward_optimizers:0, line 114 <- wrt source file 2025-10-10T01:45:05.3203827Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/apply_optimizer_in_backward.py::_get_in_backward_optimizers:0 2025-10-10T01:45:05.3205022Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/utils.py::register_functional_optim:0, line 37 <- wrt source file 2025-10-10T01:45:05.3206146Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/utils.py::register_functional_optim:0 2025-10-10T01:45:05.3207357Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/zero_redundancy_optimizer.py::ZeroRedundancyOptimizer:0, line 341 <- wrt source file 2025-10-10T01:45:05.3208654Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/zero_redundancy_optimizer.py::ZeroRedundancyOptimizer:0 2025-10-10T01:45:05.3209850Z * DOCTEST 
: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/optimizer.py::DistributedOptimizer:0, line 162 <- wrt source file 2025-10-10T01:45:05.3211016Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/optimizer.py::DistributedOptimizer:0 2025-10-10T01:45:05.3212116Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::put:0, line 275 <- wrt source file 2025-10-10T01:45:05.3213245Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::put:0 2025-10-10T01:45:05.3214329Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::get:0, line 328 <- wrt source file 2025-10-10T01:45:05.3215441Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::get:0 2025-10-10T01:45:05.3216552Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::get_nbi:0, line 378 <- wrt source file 2025-10-10T01:45:05.3217705Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::get_nbi:0 2025-10-10T01:45:05.3218882Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::putmem_signal_block:0, line 453 <- wrt source file 2025-10-10T01:45:05.3220270Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::putmem_signal_block:0 2025-10-10T01:45:05.3221454Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::wait_until:0, line 531 <- wrt source file 2025-10-10T01:45:05.3222627Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::wait_until:0 2025-10-10T01:45:05.3223932Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::signal_wait_until:0, line 593 <- wrt source file 2025-10-10T01:45:05.3225152Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::signal_wait_until:0 2025-10-10T01:45:05.3226324Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::signal_op:0, line 651 <- wrt source file 2025-10-10T01:45:05.3227490Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::signal_op:0 2025-10-10T01:45:05.3228620Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::fence:0, line 704 <- wrt source file 2025-10-10T01:45:05.3229768Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::fence:0 2025-10-10T01:45:05.3230886Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::quiet:0, line 750 <- wrt source file 2025-10-10T01:45:05.3232034Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::quiet:0 2025-10-10T01:45:05.3233148Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::my_pe:0, line 794 <- wrt source file 2025-10-10T01:45:05.3234360Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::my_pe:0 2025-10-10T01:45:05.3235470Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::n_pes:0, line 837 <- wrt source file 2025-10-10T01:45:05.3236627Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::n_pes:0 2025-10-10T01:45:05.3237802Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::barrier_all:0, line 888 <- wrt source file 2025-10-10T01:45:05.3238998Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::barrier_all:0 2025-10-10T01:45:05.3240149Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::sync_all:0, line 934 <- wrt source file 2025-10-10T01:45:05.3241319Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::sync_all:0 2025-10-10T01:45:05.3242476Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::alltoall:0, line 973 <- wrt source file 2025-10-10T01:45:05.3243645Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::alltoall:0 2025-10-10T01:45:05.3244972Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::broadcast:0, line 1028 <- wrt source file 2025-10-10T01:45:05.3246167Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::broadcast:0 2025-10-10T01:45:05.3247309Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::reduce:0, line 1089 <- wrt source file 2025-10-10T01:45:05.3248626Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::reduce:0 2025-10-10T01:45:05.3249818Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::reduce_extern_wrapper:0, line 1135 <- wrt source file 2025-10-10T01:45:05.3251111Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::reduce_extern_wrapper:0 2025-10-10T01:45:05.3252350Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_random.py::OffsetBasedRNGTracker._set_pre_op_offset:0, line 295 <- wrt source file 2025-10-10T01:45:05.3253608Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_random.py::OffsetBasedRNGTracker._set_pre_op_offset:0 2025-10-10T01:45:05.3254716Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_api.py::_shard_tensor:0, line 848 <- wrt source file 2025-10-10T01:45:05.3255748Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_api.py::_shard_tensor:0 2025-10-10T01:45:05.3256819Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_ops/_common_rules.py::pointwise_rule:0, line 231 <- wrt source file 2025-10-10T01:45:05.3257972Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_ops/_common_rules.py::pointwise_rule:0 2025-10-10T01:45:05.3259183Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_register_sharding.py::register_sharding:0, line 47 <- wrt source file 2025-10-10T01:45:05.3260487Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_register_sharding.py::register_sharding:0 2025-10-10T01:45:05.3261701Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_func_map.py::local_map:0, line 103 <- wrt source file 2025-10-10T01:45:05.3262857Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_func_map.py::local_map:0 2025-10-10T01:45:05.3264117Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_load_balancer.py::_LoadBalancer._generate_indices:0, line 28 <- wrt source file 2025-10-10T01:45:05.3265488Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_load_balancer.py::_LoadBalancer._generate_indices:0 2025-10-10T01:45:05.3266868Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_load_balancer.py::_HeadTailLoadBalancer._generate_indices:0, line 102 <- wrt source file 2025-10-10T01:45:05.3268334Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_load_balancer.py::_HeadTailLoadBalancer._generate_indices:0 2025-10-10T01:45:05.3269948Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_load_balancer.py::_PerDocumentHeadTailLoadBalancer._generate_indices:0, line 213 <- wrt source file 2025-10-10T01:45:05.3271497Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_load_balancer.py::_PerDocumentHeadTailLoadBalancer._generate_indices:0 2025-10-10T01:45:05.3272833Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/ddp.py::_pre_dp_module_transform:0, line 88 <- wrt source file 2025-10-10T01:45:05.3274032Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/ddp.py::_pre_dp_module_transform:0 2025-10-10T01:45:05.3275567Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::ColwiseParallel:0, line 64 <- wrt source file 2025-10-10T01:45:05.3276736Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::ColwiseParallel:0 2025-10-10T01:45:05.3277883Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::RowwiseParallel:0, line 198 <- wrt source file 2025-10-10T01:45:05.3279046Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::RowwiseParallel:0 2025-10-10T01:45:05.3280191Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::SequenceParallel:0, line 350 <- wrt source file 2025-10-10T01:45:05.3281375Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::SequenceParallel:0 2025-10-10T01:45:05.3282536Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleInput:0, line 452 <- wrt source file 2025-10-10T01:45:05.3283736Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleInput:0 2025-10-10T01:45:05.3284902Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleOutput:0, line 618 <- wrt source file 2025-10-10T01:45:05.3286095Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleOutput:0 2025-10-10T01:45:05.3287295Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleInputOutput:0, line 744 <- wrt source file 2025-10-10T01:45:05.3288566Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleInputOutput:0 2025-10-10T01:45:05.3289746Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/api.py::parallelize_module:0, line 56 <- wrt source file 2025-10-10T01:45:05.3290907Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/api.py::parallelize_module:0 2025-10-10T01:45:05.3292020Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/loss.py::loss_parallel:0, line 56 <- wrt source file 2025-10-10T01:45:05.3293141Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/loss.py::loss_parallel:0 2025-10-10T01:45:05.3294191Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py::CustomPolicy:0, line 224 <- wrt source file 2025-10-10T01:45:05.3295217Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py::CustomPolicy:0 2025-10-10T01:45:05.3296504Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py::MixedPrecision:0, line 202 <- wrt source file 2025-10-10T01:45:05.3297542Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py::MixedPrecision:0 2025-10-10T01:45:05.3298543Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py::StateDictType:0, line 262 <- wrt source file 2025-10-10T01:45:05.3299552Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py::StateDictType:0 2025-10-10T01:45:05.3300903Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel:0, line 125 <- wrt source file 2025-10-10T01:45:05.3302203Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel:0 2025-10-10T01:45:05.3303557Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.set_state_dict_type:0, line 651 <- wrt source file 2025-10-10T01:45:05.3305014Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.set_state_dict_type:0 2025-10-10T01:45:05.3306423Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.state_dict_type:0, line 798 <- wrt source file 2025-10-10T01:45:05.3307852Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.state_dict_type:0 2025-10-10T01:45:05.3309291Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.shard_full_optim_state_dict:0, line 1490 <- wrt source file 2025-10-10T01:45:05.3310790Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.shard_full_optim_state_dict:0 2025-10-10T01:45:05.3312280Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.scatter_full_optim_state_dict:0, line 1610 <- wrt source file 2025-10-10T01:45:05.3313803Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.scatter_full_optim_state_dict:0 2025-10-10T01:45:05.3315371Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.rekey_optim_state_dict:0, line 1695 <- wrt source file 2025-10-10T01:45:05.3316887Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.rekey_optim_state_dict:0 2025-10-10T01:45:05.3318326Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.optim_state_dict:0, line 1824 <- wrt source file 2025-10-10T01:45:05.3319773Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.optim_state_dict:0 2025-10-10T01:45:05.3321203Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.optim_state_dict_to_load:0, line 1911 <- wrt source file 2025-10-10T01:45:05.3322851Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.optim_state_dict_to_load:0 2025-10-10T01:45:05.3324157Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/sharded_grad_scaler.py::ShardedGradScaler:0, line 54 <- wrt source file 2025-10-10T01:45:05.3325372Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/sharded_grad_scaler.py::ShardedGradScaler:0 2025-10-10T01:45:05.3326483Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/functional.py::_all_gather_base:0, line 134 <- wrt source file 2025-10-10T01:45:05.3327721Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/functional.py::_all_gather_base:0 2025-10-10T01:45:05.3328844Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::_RemoteModule.__init__:0, line 196 <- wrt source file 2025-10-10T01:45:05.3330022Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::_RemoteModule.__init__:0 2025-10-10T01:45:05.3331214Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::_RemoteModule.init_from_module_rref:0, line 532 <- wrt source file 2025-10-10T01:45:05.3332485Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::_RemoteModule.init_from_module_rref:0 2025-10-10T01:45:05.3333644Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::RemoteModule:0, line 663 <- wrt source file 2025-10-10T01:45:05.3334747Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::RemoteModule:0 2025-10-10T01:45:05.3335871Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/checkpoint_activation.py::checkpoint:0, line 53 <- wrt source file 2025-10-10T01:45:05.3337063Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/checkpoint_activation.py::checkpoint:0 2025-10-10T01:45:05.3338154Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/contract.py::contract:0, line 67 <- wrt source file 2025-10-10T01:45:05.3339233Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/contract.py::contract:0 2025-10-10T01:45:05.3340313Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/replicate_with_fsdp.py::replicate:0, line 251 <- wrt source file 2025-10-10T01:45:05.3341476Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/replicate_with_fsdp.py::replicate:0 2025-10-10T01:45:05.3342576Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/replicate.py::replicate:0, line 190 <- wrt source file 2025-10-10T01:45:05.3343654Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/replicate.py::replicate:0 2025-10-10T01:45:05.3344803Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/events/__init__.py::construct_and_record_rdzv_event:0, line 110 <- wrt source file 2025-10-10T01:45:05.3346092Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/events/__init__.py::construct_and_record_rdzv_event:0 2025-10-10T01:45:05.3347329Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/rendezvous/api.py::RendezvousHandler.shutdown:0, line 232 <- wrt source file 2025-10-10T01:45:05.3348722Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/rendezvous/api.py::RendezvousHandler.shutdown:0 2025-10-10T01:45:05.3349918Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/utils/distributed.py::get_free_port:0, line 141 <- wrt source file 2025-10-10T01:45:05.3351077Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/utils/distributed.py::get_free_port:0 2025-10-10T01:45:05.3352335Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::get_state_dict:0, line 1146 <- wrt source file 2025-10-10T01:45:05.3353465Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::get_state_dict:0 2025-10-10T01:45:05.3354678Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::_patch_model_state_dict:0, line 1398 <- wrt source file 2025-10-10T01:45:05.3355874Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::_patch_model_state_dict:0 2025-10-10T01:45:05.3357065Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::_patch_optimizer_state_dict:0, line 1457 <- wrt source file 2025-10-10T01:45:05.3358314Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::_patch_optimizer_state_dict:0 2025-10-10T01:45:05.3359543Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/format_utils.py::BroadcastingTorchSaveReader:0, line 49 <- wrt source file 2025-10-10T01:45:05.3360832Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/format_utils.py::BroadcastingTorchSaveReader:0 2025-10-10T01:45:05.3362054Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/format_utils.py::DynamicMetaLoadPlanner:0, line 164 <- wrt source file 2025-10-10T01:45:05.3363279Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/format_utils.py::DynamicMetaLoadPlanner:0 2025-10-10T01:45:05.3364402Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_saver.py::save:0, line 159 <- wrt source file 2025-10-10T01:45:05.3365497Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_saver.py::save:0 2025-10-10T01:45:05.3366588Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_saver.py::async_save:0, line 273 <- wrt source file 2025-10-10T01:45:05.3367733Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_saver.py::async_save:0 2025-10-10T01:45:05.3368828Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_loader.py::load:0, line 131 <- wrt source file 2025-10-10T01:45:05.3369925Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_loader.py::load:0 2025-10-10T01:45:05.3371091Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/optimizer.py::load_sharded_optimizer_state_dict:0, line 228 <- wrt source file 2025-10-10T01:45:05.3372360Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/optimizer.py::load_sharded_optimizer_state_dict:0 2025-10-10T01:45:05.3373790Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::SyncCheckpointer:0, line 104 <- wrt source file 2025-10-10T01:45:05.3375088Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::SyncCheckpointer:0 2025-10-10T01:45:05.3376360Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::SyncCheckpointer.save:0, line 142 <- wrt source file 2025-10-10T01:45:05.3377976Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::SyncCheckpointer.save:0 2025-10-10T01:45:05.3379258Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::AsyncCheckpointer:0, line 213 <- wrt source file 2025-10-10T01:45:05.3380555Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::AsyncCheckpointer:0 2025-10-10T01:45:05.3381839Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::AsyncCheckpointer.save:0, line 260 <- wrt source file 2025-10-10T01:45:05.3383168Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::AsyncCheckpointer.save:0 2025-10-10T01:45:05.3384419Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/barriers.py::BarrierConfig:0, line 50 <- wrt source file 2025-10-10T01:45:05.3385643Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/barriers.py::BarrierConfig:0 2025-10-10T01:45:05.3386866Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/builder.py::make_sync_checkpointer:0, line 78 <- wrt source file 2025-10-10T01:45:05.3388132Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/builder.py::make_sync_checkpointer:0 2025-10-10T01:45:05.3389372Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/builder.py::make_async_checkpointer:0, line 139 <- wrt source file 2025-10-10T01:45:05.3390652Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/builder.py::make_async_checkpointer:0 2025-10-10T01:45:05.3391891Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/staging.py::DefaultStager.close:0, line 208 <- wrt source file 2025-10-10T01:45:05.3393157Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/staging.py::DefaultStager.close:0 2025-10-10T01:45:05.3394385Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/package/glob_group.py::GlobGroup:0, line 22 <- wrt source file 
2025-10-10T01:45:05.3395368Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/package/glob_group.py::GlobGroup:0 2025-10-10T01:45:05.3396324Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_equal:0, line 171 <- wrt source file 2025-10-10T01:45:05.3397323Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_equal:0 2025-10-10T01:45:05.3398301Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::print_assert_equal:0, line 302 <- wrt source file 2025-10-10T01:45:05.3399499Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::print_assert_equal:0 2025-10-10T01:45:05.3400520Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_almost_equal:0, line 375 <- wrt source file 2025-10-10T01:45:05.3401562Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_almost_equal:0 2025-10-10T01:45:05.3402570Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_approx_equal:0, line 496 <- wrt source file 2025-10-10T01:45:05.3403766Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_approx_equal:0 2025-10-10T01:45:05.3404766Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_equal:0, line 793 <- wrt source file 2025-10-10T01:45:05.3423789Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_equal:0 2025-10-10T01:45:05.3424855Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_almost_equal:0, line 899 <- wrt source file 2025-10-10T01:45:05.3487850Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_almost_equal:0 2025-10-10T01:45:05.3489727Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_less:0, line 1008 <- wrt source file 2025-10-10T01:45:05.3542400Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_less:0 2025-10-10T01:45:05.3544207Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_string_equal:0, line 1073 <- wrt source file 2025-10-10T01:45:05.3546051Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_string_equal:0 2025-10-10T01:45:05.3547780Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_allclose:0, line 1294 <- wrt source file 2025-10-10T01:45:05.3562103Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_allclose:0 2025-10-10T01:45:05.3563193Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_almost_equal_nulp:0, line 1360 <- wrt source file 2025-10-10T01:45:05.3566761Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_almost_equal_nulp:0 2025-10-10T01:45:05.3567860Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_max_ulp:0, line 1423 <- wrt source file 2025-10-10T01:45:05.3571662Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_max_ulp:0 2025-10-10T01:45:05.3572685Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::nulp_diff:0, line 1468 <- wrt source file 2025-10-10T01:45:05.3573669Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::nulp_diff:0 2025-10-10T01:45:05.3574663Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_warns:0, line 1578 <- wrt source file 2025-10-10T01:45:05.3578691Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_warns:0 2025-10-10T01:45:05.3579927Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::clear_and_catch_warnings:0, line 1881 <- wrt source file 2025-10-10T01:45:05.3581556Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::clear_and_catch_warnings:0 2025-10-10T01:45:05.3585145Z ============ 2025-10-10T01:45:05.3585407Z Finished doctests 2025-10-10T01:45:05.3585603Z 376 / 877 passed 2025-10-10T01:45:05.3585809Z  2025-10-10T01:45:05.3586063Z === Found 18 parse-time warnings === 2025-10-10T01:45:05.3586605Z --- Parse Warning: 1 / 18 --- 2025-10-10T01:45:05.3587864Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=Library.fallback in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=367. 2025-10-10T01:45:05.3588873Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3589370Z Registers the function implementation as the fallback for the given key. 2025-10-10T01:45:05.3589728Z 2025-10-10T01:45:05.3590003Z This function only works for a library with global namespace ("_"). 2025-10-10T01:45:05.3590343Z 2025-10-10T01:45:05.3590506Z Args: 2025-10-10T01:45:05.3590852Z fn: function used as fallback for the given dispatch key or :func:`~fallthrough_kernel` 2025-10-10T01:45:05.3591269Z to register a fallthrough. 2025-10-10T01:45:05.3591721Z dispatch_key: dispatch key that the input function should be registered for. 
By default, it uses 2025-10-10T01:45:05.3592231Z the dispatch key that the library was created with. 2025-10-10T01:45:05.3592789Z with_keyset: flag controlling if the current dispatcher call keyset should be passed as the first argument 2025-10-10T01:45:05.3593457Z to :attr:`fn` when calling. This should be used to create the appropriate keyset for redispatch calls. 2025-10-10T01:45:05.3593881Z 2025-10-10T01:45:05.3594059Z Example:: 2025-10-10T01:45:05.3594343Z 2025-10-10T01:45:05.3594529Z >>> my_lib = Library("_", "IMPL") 2025-10-10T01:45:05.3594828Z >>> def fallback_kernel(op, *args, **kwargs): 2025-10-10T01:45:05.3595137Z >>> # Handle all autocast ops generically 2025-10-10T01:45:05.3595406Z >>> # ... 2025-10-10T01:45:05.3595662Z >>> my_lib.fallback(fallback_kernel, "Autocast") 2025-10-10T01:45:05.3595948Z 2025-10-10T01:45:05.3596555Z Original Error: IndentationError('expected an indented block after function definition on line 2', ('', 5, 1, 'my_lib.fallback(fallback_kernel, "Autocast")\n', 5, 7)) 2025-10-10T01:45:05.3597223Z 2025-10-10T01:45:05.3597424Z my_lib.fallback(fallback_kernel, "Autocast") 2025-10-10T01:45:05.3597684Z ^ 2025-10-10T01:45:05.3597870Z warnings.warn(msg) 2025-10-10T01:45:05.3598079Z 2025-10-10T01:45:05.3598337Z --- Parse Warning: 2 / 18 --- 2025-10-10T01:45:05.3599257Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=register_fake in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=942. 2025-10-10T01:45:05.3600230Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3600749Z Register a FakeTensor implementation ("fake impl") for this operator. 2025-10-10T01:45:05.3601136Z 2025-10-10T01:45:05.3601380Z Also sometimes known as a "meta kernel", "abstract impl". 
2025-10-10T01:45:05.3601685Z 2025-10-10T01:45:05.3601973Z An "FakeTensor implementation" specifies the behavior of this operator on 2025-10-10T01:45:05.3602448Z Tensors that carry no data ("FakeTensor"). Given some input Tensors with 2025-10-10T01:45:05.3603087Z certain properties (sizes/strides/storage_offset/device), it specifies 2025-10-10T01:45:05.3603508Z what the properties of the output Tensors are. 2025-10-10T01:45:05.3603791Z 2025-10-10T01:45:05.3604071Z The FakeTensor implementation has the same signature as the operator. 2025-10-10T01:45:05.3604516Z It is run for both FakeTensors and meta tensors. To write a FakeTensor 2025-10-10T01:45:05.3604957Z implementation, assume that all Tensor inputs to the operator are 2025-10-10T01:45:05.3605391Z regular CPU/CUDA/Meta tensors, but they do not have storage, and 2025-10-10T01:45:05.3605974Z you are trying to return regular CPU/CUDA/Meta tensor(s) as output. 2025-10-10T01:45:05.3606416Z The FakeTensor implementation must consist of only PyTorch operations 2025-10-10T01:45:05.3606854Z (and may not directly access the storage or data of any input or 2025-10-10T01:45:05.3607183Z intermediate Tensors). 2025-10-10T01:45:05.3607407Z 2025-10-10T01:45:05.3607646Z This API may be used as a decorator (see examples). 2025-10-10T01:45:05.3607932Z 2025-10-10T01:45:05.3608140Z For a detailed guide on custom ops, please see 2025-10-10T01:45:05.3608550Z https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html 2025-10-10T01:45:05.3608895Z 2025-10-10T01:45:05.3609057Z Args: 2025-10-10T01:45:05.3609348Z op_name: Operator name (along with the overload) or OpOverload object. 2025-10-10T01:45:05.3609724Z func: Fake tensor implementation. 2025-10-10T01:45:05.3610077Z lib (Optional[Library]): Library to register the fake tensor to. 2025-10-10T01:45:05.3610481Z allow_override: Flag controlling if we want to override an 2025-10-10T01:45:05.3610870Z existing registered fake impl. 
This is by default off, 2025-10-10T01:45:05.3611249Z and will error you're trying to register a fake impl to 2025-10-10T01:45:05.3611631Z an operator that already has a fake impl. This also only 2025-10-10T01:45:05.3612000Z applies if the custom operator was not created via 2025-10-10T01:45:05.3612383Z torch.library.custom_op, as overriding and existing fake 2025-10-10T01:45:05.3612731Z impl is already allowed. 2025-10-10T01:45:05.3612980Z 2025-10-10T01:45:05.3613147Z Examples: 2025-10-10T01:45:05.3613349Z >>> import torch 2025-10-10T01:45:05.3613583Z >>> import numpy as np 2025-10-10T01:45:05.3613850Z >>> from torch import Tensor 2025-10-10T01:45:05.3614092Z >>> 2025-10-10T01:45:05.3614361Z >>> # Example 1: an operator without data-dependent output shape 2025-10-10T01:45:05.3614779Z >>> @torch.library.custom_op("mylib::custom_linear", mutates_args=()) 2025-10-10T01:45:05.3615220Z >>> def custom_linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor: 2025-10-10T01:45:05.3615650Z >>> raise NotImplementedError("Implementation goes here") 2025-10-10T01:45:05.3615961Z >>> 2025-10-10T01:45:05.3616215Z >>> @torch.library.register_fake("mylib::custom_linear") 2025-10-10T01:45:05.3616538Z >>> def _(x, weight, bias): 2025-10-10T01:45:05.3616796Z >>> assert x.dim() == 2 2025-10-10T01:45:05.3617059Z >>> assert weight.dim() == 2 2025-10-10T01:45:05.3617334Z >>> assert bias.dim() == 1 2025-10-10T01:45:05.3617627Z >>> assert x.shape[1] == weight.shape[1] 2025-10-10T01:45:05.3617934Z >>> assert weight.shape[0] == bias.shape[0] 2025-10-10T01:45:05.3618240Z >>> assert x.device == weight.device 2025-10-10T01:45:05.3618505Z >>> 2025-10-10T01:45:05.3618715Z >>> return (x @ weight.t()) + bias 2025-10-10T01:45:05.3618973Z >>> 2025-10-10T01:45:05.3619228Z >>> with torch._subclasses.fake_tensor.FakeTensorMode(): 2025-10-10T01:45:05.3619703Z >>> x = torch.randn(2, 3) 2025-10-10T01:45:05.3620197Z >>> w = torch.randn(3, 3) 2025-10-10T01:45:05.3620458Z >>> b = torch.randn(3) 
2025-10-10T01:45:05.3620740Z >>> y = torch.ops.mylib.custom_linear(x, w, b) 2025-10-10T01:45:05.3621023Z >>> 2025-10-10T01:45:05.3621222Z >>> assert y.shape == (2, 3) 2025-10-10T01:45:05.3621472Z >>> 2025-10-10T01:45:05.3621731Z >>> # Example 2: an operator with data-dependent output shape 2025-10-10T01:45:05.3622321Z >>> @torch.library.custom_op("mylib::custom_nonzero", mutates_args=()) 2025-10-10T01:45:05.3622702Z >>> def custom_nonzero(x: Tensor) -> Tensor: 2025-10-10T01:45:05.3623003Z >>> x_np = x.numpy(force=True) 2025-10-10T01:45:05.3623300Z >>> res = np.stack(np.nonzero(x_np), axis=1) 2025-10-10T01:45:05.3623622Z >>> return torch.tensor(res, device=x.device) 2025-10-10T01:45:05.3623906Z >>> 2025-10-10T01:45:05.3624165Z >>> @torch.library.register_fake("mylib::custom_nonzero") 2025-10-10T01:45:05.3624475Z >>> def _(x): 2025-10-10T01:45:05.3624750Z >>> # Number of nonzero-elements is data-dependent. 2025-10-10T01:45:05.3625097Z >>> # Since we cannot peek at the data in an fake impl, 2025-10-10T01:45:05.3625453Z >>> # we use the ctx object to construct a new symint that 2025-10-10T01:45:05.3625795Z >>> # represents the data-dependent size. 
2025-10-10T01:45:05.3626111Z >>> ctx = torch.library.get_ctx() 2025-10-10T01:45:05.3626403Z >>> nnz = ctx.new_dynamic_size() 2025-10-10T01:45:05.3626683Z >>> shape = [nnz, x.dim()] 2025-10-10T01:45:05.3626995Z >>> result = x.new_empty(shape, dtype=torch.int64) 2025-10-10T01:45:05.3627301Z >>> return result 2025-10-10T01:45:05.3627541Z >>> 2025-10-10T01:45:05.3627819Z >>> from torch.fx.experimental.proxy_tensor import make_fx 2025-10-10T01:45:05.3628136Z >>> 2025-10-10T01:45:05.3628355Z >>> x = torch.tensor([0, 1, 2, 3, 4, 0]) 2025-10-10T01:45:05.3628761Z >>> trace = make_fx(torch.ops.mylib.custom_nonzero, tracing_mode="symbolic")(x) 2025-10-10T01:45:05.3629167Z >>> trace.print_readable() 2025-10-10T01:45:05.3629420Z >>> 2025-10-10T01:45:05.3629722Z >>> assert torch.allclose(trace(x), torch.ops.mylib.custom_nonzero(x)) 2025-10-10T01:45:05.3630082Z 2025-10-10T01:45:05.3630256Z 2025-10-10T01:45:05.3630785Z Original Error: IndentationError('expected an indented block after function definition on line 37', ('', 38, 1, '_._ = None\n', 38, 2)) 2025-10-10T01:45:05.3631372Z 2025-10-10T01:45:05.3631547Z _._ = None 2025-10-10T01:45:05.3631734Z ^ 2025-10-10T01:45:05.3631926Z warnings.warn(msg) 2025-10-10T01:45:05.3632145Z 2025-10-10T01:45:05.3632411Z --- Parse Warning: 3 / 18 --- 2025-10-10T01:45:05.3633262Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=get_kernel in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=1476. 2025-10-10T01:45:05.3634299Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3634783Z Returns the computed kernel for a given operator and dispatch key. 2025-10-10T01:45:05.3635128Z 2025-10-10T01:45:05.3635405Z This function retrieves the kernel that would be executed for a given 2025-10-10T01:45:05.3635869Z operator and dispatch key combination. 
The returned SafeKernelFunction 2025-10-10T01:45:05.3636314Z can be used to call the kernel in a boxed fashion. The intended use 2025-10-10T01:45:05.3636742Z case for this function is to retrieve the original kernel for a given 2025-10-10T01:45:05.3637353Z dispatch key and then register another kernel to the same dispatch key 2025-10-10T01:45:05.3637773Z that calls into the original kernel for certain cases. 2025-10-10T01:45:05.3638080Z 2025-10-10T01:45:05.3638253Z Args: 2025-10-10T01:45:05.3638531Z op: Operator name (along with the overload) or OpOverload object 2025-10-10T01:45:05.3638982Z Can be a string (e.g., "aten::add.Tensor"), an OpOverload, or a CustomOpDef. 2025-10-10T01:45:05.3639478Z dispatch_key (str | torch.DispatchKey): The dispatch key to get the kernel for. 2025-10-10T01:45:05.3640111Z Can be a string (e.g., "CPU", "CUDA") or a DispatchKey enum value. 2025-10-10T01:45:05.3640429Z 2025-10-10T01:45:05.3640598Z Returns: 2025-10-10T01:45:05.3640923Z torch._C._SafeKernelFunction: A safe kernel function that can be used to 2025-10-10T01:45:05.3641302Z call the kernel. 2025-10-10T01:45:05.3641521Z 2025-10-10T01:45:05.3641693Z Raises: 2025-10-10T01:45:05.3641944Z RuntimeError: If the operator does not exist. 
2025-10-10T01:45:05.3642230Z 2025-10-10T01:45:05.3642406Z Example: 2025-10-10T01:45:05.3642630Z >>> # Get the CPU kernel for torch.add 2025-10-10T01:45:05.3642989Z >>> kernel = torch.library.get_kernel("aten::add.Tensor", "CPU") 2025-10-10T01:45:05.3643311Z >>> 2025-10-10T01:45:05.3643519Z >>> # You can also use DispatchKey enum 2025-10-10T01:45:05.3643920Z >>> kernel = torch.library.get_kernel("aten::add.Tensor", torch.DispatchKey.CPU) 2025-10-10T01:45:05.3644309Z >>> 2025-10-10T01:45:05.3644511Z >>> # Or use an OpOverload directly 2025-10-10T01:45:05.3644881Z >>> kernel = torch.library.get_kernel(torch.ops.aten.add.Tensor, "CPU") 2025-10-10T01:45:05.3645242Z >>> 2025-10-10T01:45:05.3645526Z >>> # Example: Using get_kernel in a custom op with conditional dispatch 2025-10-10T01:45:05.3645918Z >>> # Get the original kernel for torch.sin 2025-10-10T01:45:05.3646295Z >>> original_sin_kernel = torch.library.get_kernel("aten::sin", "CPU") 2025-10-10T01:45:05.3646625Z >>> 2025-10-10T01:45:05.3646919Z >>> # If input has negative values, use original sin, otherwise return zeros 2025-10-10T01:45:05.3647322Z >>> def conditional_sin_impl(dispatch_keys, x): 2025-10-10T01:45:05.3647625Z >>> if (x < 0).any(): 2025-10-10T01:45:05.3647946Z >>> return original_sin_kernel.call_boxed(dispatch_keys, x) 2025-10-10T01:45:05.3648278Z >>> else: 2025-10-10T01:45:05.3648512Z >>> return torch.zeros_like(x) 2025-10-10T01:45:05.3648768Z >>> 2025-10-10T01:45:05.3648998Z >>> lib = torch.library.Library("aten", "IMPL") 2025-10-10T01:45:05.3649424Z >>> # with_keyset=True so the first argument to the impl is the current DispatchKeySet 2025-10-10T01:45:05.3649903Z >>> which needs to be the first argument to ``kernel.call_boxed`` 2025-10-10T01:45:05.3650305Z >>> lib.impl("sin", conditional_sin_impl, "CPU", with_keyset=True) 2025-10-10T01:45:05.3650637Z >>> 2025-10-10T01:45:05.3650846Z >>> # Test the conditional behavior 2025-10-10T01:45:05.3651137Z >>> x_positive = torch.tensor([1.0, 2.0]) 
2025-10-10T01:45:05.3651436Z >>> x_mixed = torch.tensor([-1.0, 2.0]) 2025-10-10T01:45:05.3651716Z >>> torch.sin(x_positive) 2025-10-10T01:45:05.3651981Z tensor([0., 0.]) 2025-10-10T01:45:05.3652220Z >>> torch.sin(x_mixed) 2025-10-10T01:45:05.3652470Z tensor([-0.8415, 0.9093]) 2025-10-10T01:45:05.3652707Z 2025-10-10T01:45:05.3653206Z Original Error: SyntaxError('invalid syntax', ('', 23, 7, 'which needs to be the first argument to ``kernel.call_boxed``\n', 23, 12)) 2025-10-10T01:45:05.3653752Z 2025-10-10T01:45:05.3654145Z which needs to be the first argument to ``kernel.call_boxed`` 2025-10-10T01:45:05.3654469Z ^ 2025-10-10T01:45:05.3654664Z warnings.warn(msg) 2025-10-10T01:45:05.3654882Z 2025-10-10T01:45:05.3655132Z --- Parse Warning: 4 / 18 --- 2025-10-10T01:45:05.3655985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=cudart in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py line=435. 2025-10-10T01:45:05.3657060Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3657464Z Retrieves the CUDA runtime API module. 2025-10-10T01:45:05.3657722Z 2025-10-10T01:45:05.3657889Z 2025-10-10T01:45:05.3658191Z This function initializes the CUDA runtime environment if it is not already 2025-10-10T01:45:05.3658680Z initialized and returns the CUDA runtime API module (_cudart). The CUDA 2025-10-10T01:45:05.3659149Z runtime API module provides access to various CUDA runtime functions. 2025-10-10T01:45:05.3659503Z 2025-10-10T01:45:05.3659667Z Args: 2025-10-10T01:45:05.3659861Z ``None`` 2025-10-10T01:45:05.3660062Z 2025-10-10T01:45:05.3660239Z Returns: 2025-10-10T01:45:05.3660488Z module: The CUDA runtime API module (_cudart). 2025-10-10T01:45:05.3660782Z 2025-10-10T01:45:05.3660951Z Raises: 2025-10-10T01:45:05.3661258Z RuntimeError: If CUDA cannot be re-initialized in a forked subprocess. 
2025-10-10T01:45:05.3661841Z AssertionError: If PyTorch is not compiled with CUDA support or if libcudart functions are unavailable. 2025-10-10T01:45:05.3662303Z 2025-10-10T01:45:05.3662514Z Example of CUDA operations with profiling: 2025-10-10T01:45:05.3662800Z >>> import torch 2025-10-10T01:45:05.3663070Z >>> from torch.cuda import cudart, check_error 2025-10-10T01:45:05.3663354Z >>> import os 2025-10-10T01:45:05.3663572Z >>> 2025-10-10T01:45:05.3663786Z >>> os.environ["CUDA_PROFILE"] = "1" 2025-10-10T01:45:05.3664047Z >>> 2025-10-10T01:45:05.3664280Z >>> def perform_cuda_operations_with_streams(): 2025-10-10T01:45:05.3664591Z >>> stream = torch.cuda.Stream() 2025-10-10T01:45:05.3664888Z >>> with torch.cuda.stream(stream): 2025-10-10T01:45:05.3665193Z >>> x = torch.randn(100, 100, device='cuda') 2025-10-10T01:45:05.3665501Z >>> y = torch.randn(100, 100, device='cuda') 2025-10-10T01:45:05.3665797Z >>> z = torch.mul(x, y) 2025-10-10T01:45:05.3666057Z >>> return z 2025-10-10T01:45:05.3666276Z >>> 2025-10-10T01:45:05.3666491Z >>> torch.cuda.synchronize() 2025-10-10T01:45:05.3666795Z >>> print("====== Start nsys profiling ======") 2025-10-10T01:45:05.3667120Z >>> check_error(cudart().cudaProfilerStart()) 2025-10-10T01:45:05.3667454Z >>> with torch.autograd.profiler.emit_nvtx(): 2025-10-10T01:45:05.3667798Z >>> result = perform_cuda_operations_with_streams() 2025-10-10T01:45:05.3668134Z >>> print("CUDA operations completed.") 2025-10-10T01:45:05.3668469Z >>> check_error(torch.cuda.cudart().cudaProfilerStop()) 2025-10-10T01:45:05.3668806Z >>> print("====== End nsys profiling ======") 2025-10-10T01:45:05.3669076Z 2025-10-10T01:45:05.3669345Z To run this example and save the profiling information, execute: 2025-10-10T01:45:05.3669925Z >>> $ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py 2025-10-10T01:45:05.3670384Z 2025-10-10T01:45:05.3670687Z This command profiles the CUDA operations in the provided script and saves 
2025-10-10T01:45:05.3671146Z the profiling information to a file named `trace_name.prof`. 2025-10-10T01:45:05.3671730Z The `--profile-from-start off` option ensures that profiling starts only 2025-10-10T01:45:05.3672151Z after the `cudaProfilerStart` call in the script. 2025-10-10T01:45:05.3672548Z The `--csv` and `--print-summary` options format the profiling output as a 2025-10-10T01:45:05.3672944Z CSV file and print a summary, respectively. 2025-10-10T01:45:05.3673348Z The `-o` option specifies the output file name, and the `-f` option forces the 2025-10-10T01:45:05.3673774Z overwrite of the output file if it already exists. 2025-10-10T01:45:05.3674367Z 2025-10-10T01:45:05.3675014Z Original Error: SyntaxError('invalid syntax', ('', 1, 1, '$ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py\n', 1, 2)) 2025-10-10T01:45:05.3675696Z 2025-10-10T01:45:05.3676077Z $ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py 2025-10-10T01:45:05.3676526Z ^ 2025-10-10T01:45:05.3676712Z warnings.warn(msg) 2025-10-10T01:45:05.3676920Z 2025-10-10T01:45:05.3677170Z --- Parse Warning: 5 / 18 --- 2025-10-10T01:45:05.3678062Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=is_available in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py line=66. 2025-10-10T01:45:05.3679235Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3679744Z Check if the current accelerator is available at runtime: it was build, all the 2025-10-10T01:45:05.3680210Z required drivers are available and at least one device is visible. 2025-10-10T01:45:05.3680597Z See :ref:`accelerator` for details. 2025-10-10T01:45:05.3680868Z 2025-10-10T01:45:05.3681030Z Returns: 2025-10-10T01:45:05.3681372Z bool: A boolean indicating if there is an available :ref:`accelerator`. 
2025-10-10T01:45:05.3681755Z 2025-10-10T01:45:05.3682048Z .. note:: This API delegates to the device-specific version of `is_available`. 2025-10-10T01:45:05.3682547Z On CUDA, when the environment variable ``PYTORCH_NVML_BASED_CUDA_CHECK=1`` is set, 2025-10-10T01:45:05.3683049Z this function will NOT poison fork. Otherwise, it will. For more details, see 2025-10-10T01:45:05.3683471Z :ref:`multiprocessing-poison-fork-note`. 2025-10-10T01:45:05.3683749Z 2025-10-10T01:45:05.3683933Z Example:: 2025-10-10T01:45:05.3684119Z 2025-10-10T01:45:05.3684432Z >>> assert torch.accelerator.is_available() "No available accelerators detected." 2025-10-10T01:45:05.3684809Z 2025-10-10T01:45:05.3685355Z Original Error: SyntaxError('invalid syntax', ('', 1, 41, 'assert torch.accelerator.is_available() "No available accelerators detected."\n', 1, 78)) 2025-10-10T01:45:05.3685959Z 2025-10-10T01:45:05.3686276Z assert torch.accelerator.is_available() "No available accelerators detected." 2025-10-10T01:45:05.3686675Z ^ 2025-10-10T01:45:05.3686930Z warnings.warn(msg) 2025-10-10T01:45:05.3687137Z 2025-10-10T01:45:05.3687375Z --- Parse Warning: 6 / 18 --- 2025-10-10T01:45:05.3688283Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=synchronize in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py line=212. 2025-10-10T01:45:05.3689288Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3689759Z Wait for all kernels in all streams on the given device to complete. 2025-10-10T01:45:05.3690096Z 2025-10-10T01:45:05.3690271Z Args: 2025-10-10T01:45:05.3690637Z device (:class:`torch.device`, str, int, optional): device for which to synchronize. It must match 2025-10-10T01:45:05.3691381Z the current :ref:`accelerator` device type. If not given, 2025-10-10T01:45:05.3700929Z use :func:`torch.accelerator.current_device_index` by default. 
2025-10-10T01:45:05.3701286Z 2025-10-10T01:45:05.3701663Z .. note:: This function is a no-op if the current :ref:`accelerator` is not initialized. 2025-10-10T01:45:05.3702094Z 2025-10-10T01:45:05.3702267Z Example:: 2025-10-10T01:45:05.3702457Z 2025-10-10T01:45:05.3702675Z >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA) 2025-10-10T01:45:05.3703338Z >>> assert torch.accelerator.is_available() "No available accelerators detected." 2025-10-10T01:45:05.3703768Z >>> start_event = torch.Event(enable_timing=True) 2025-10-10T01:45:05.3704087Z >>> end_event = torch.Event(enable_timing=True) 2025-10-10T01:45:05.3704384Z >>> start_event.record() 2025-10-10T01:45:05.3704765Z >>> tensor = torch.randn(100, device=torch.accelerator.current_accelerator()) 2025-10-10T01:45:05.3705149Z >>> sum = torch.sum(tensor) 2025-10-10T01:45:05.3705405Z >>> end_event.record() 2025-10-10T01:45:05.3705686Z >>> torch.accelerator.synchronize() 2025-10-10T01:45:05.3706030Z >>> elapsed_time_ms = start_event.elapsed_time(end_event) 2025-10-10T01:45:05.3706332Z 2025-10-10T01:45:05.3706886Z Original Error: SyntaxError('invalid syntax', ('', 2, 41, 'assert torch.accelerator.is_available() "No available accelerators detected."\n', 2, 78)) 2025-10-10T01:45:05.3707488Z 2025-10-10T01:45:05.3707791Z assert torch.accelerator.is_available() "No available accelerators detected." 2025-10-10T01:45:05.3708177Z ^ 2025-10-10T01:45:05.3708427Z warnings.warn(msg) 2025-10-10T01:45:05.3708633Z 2025-10-10T01:45:05.3708920Z --- Parse Warning: 7 / 18 --- 2025-10-10T01:45:05.3709895Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=unsafe_generate_fake_kernels in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_profile.py line=94. 
2025-10-10T01:45:05.3710991Z Caused by: DoctestParseError('Failed to parse doctest in _label_docsrc_lines') 2025-10-10T01:45:05.3711378Z 2025-10-10T01:45:05.3711664Z Registers a fake kernel based on the given operator profiles. This fake 2025-10-10T01:45:05.3712156Z kernel registration will override any existing fake kernel registrations. 2025-10-10T01:45:05.3712522Z 2025-10-10T01:45:05.3712787Z The input is a dictionary mapping operator names to a set of operator 2025-10-10T01:45:05.3713244Z profiles, which we will use to generate fake kernels. The operator profiles 2025-10-10T01:45:05.3713686Z are a record of the input and output tensor metadata. Based on this 2025-10-10T01:45:05.3714209Z information we will match a given input to the recorded profile, and return 2025-10-10T01:45:05.3714673Z an output with the same metadata as in the recorded profile. If a profile 2025-10-10T01:45:05.3715060Z doesn't exist then an exception will be thrown. 2025-10-10T01:45:05.3715328Z 2025-10-10T01:45:05.3715596Z The fake kernel generation is considered unsafe because it relies on the 2025-10-10T01:45:05.3716057Z rigid, pre-defined operator profiles that do not account for potential 2025-10-10T01:45:05.3716527Z variations in output behavior. Specifically, the generated kernels assume a 2025-10-10T01:45:05.3717024Z fixed relationship between input and output ranks. However, in reality, it's 2025-10-10T01:45:05.3717511Z possible that data-dependent operations may produce outputs of different 2025-10-10T01:45:05.3717967Z ranks even when given inputs of the same rank. The generated fake kernels 2025-10-10T01:45:05.3718412Z are inflexible and unable to accommodate these nuances, making them 2025-10-10T01:45:05.3718756Z potentially unsafe. 
2025-10-10T01:45:05.3718967Z 2025-10-10T01:45:05.3719310Z Args: 2025-10-10T01:45:05.3719594Z op_profiles (dict[str, set[OpProfile]]): A dictionary mapping operator 2025-10-10T01:45:05.3720029Z name to a set of operator profiles from which we will generate fake 2025-10-10T01:45:05.3720349Z kernels. 2025-10-10T01:45:05.3720549Z 2025-10-10T01:45:05.3720728Z Examples: 2025-10-10T01:45:05.3720925Z 2025-10-10T01:45:05.3721162Z >>> # Example: Registering an op-profile from draft-export 2025-10-10T01:45:05.3721653Z >>> import torch 2025-10-10T01:45:05.3721958Z >>> from torch.export._draft_export import draft_export 2025-10-10T01:45:05.3722258Z >>> 2025-10-10T01:45:05.3722513Z >>> @torch.library.custom_op("mylib::foo", mutates_args=()) 2025-10-10T01:45:05.3722858Z >>> def foo(x: Tensor, y: Tensor) -> Tensor: 2025-10-10T01:45:05.3723140Z >>> return x + y 2025-10-10T01:45:05.3723366Z >>> 2025-10-10T01:45:05.3723585Z >>> class M(torch.nn.Module): 2025-10-10T01:45:05.3723855Z >>> def forward(self, a, b): 2025-10-10T01:45:05.3724149Z >>> res = torch.ops.mylib.foo(a, b) # no fake impl 2025-10-10T01:45:05.3724442Z >>> return res 2025-10-10T01:45:05.3724656Z >>> 2025-10-10T01:45:05.3724895Z >>> ep = draft_export(M(), (torch.ones(3, 4), torch.ones(3, 4)) 2025-10-10T01:45:05.3725189Z >>> 2025-10-10T01:45:05.3725530Z >>> with torch._library.fake_profile.unsafe_generate_fake_kernels(ep._report.op_profiles): 2025-10-10T01:45:05.3725971Z >>> decomp = ep.run_decompositions() 2025-10-10T01:45:05.3726218Z 2025-10-10T01:45:05.3726375Z 2025-10-10T01:45:05.3726842Z Original Error: IncompleteParseError('ill-formed doctest: all parts have been processed but the doctest source is not balanced') 2025-10-10T01:45:05.3727371Z 2025-10-10T01:45:05.3727542Z warnings.warn(msg) 2025-10-10T01:45:05.3727752Z 2025-10-10T01:45:05.3727996Z --- Parse Warning: 8 / 18 --- 2025-10-10T01:45:05.3728950Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape 
callname=CustomOpDef.register_fake in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py line=401. 2025-10-10T01:45:05.3729981Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3730421Z Register a FakeTensor implementation for this custom op. 2025-10-10T01:45:05.3730731Z 2025-10-10T01:45:05.3731022Z This is necessary to get the operator to work efficiently with torch.compile. 2025-10-10T01:45:05.3731377Z 2025-10-10T01:45:05.3731645Z The Fake impl (sometimes also known as a meta kernel or abstract impl) 2025-10-10T01:45:05.3732096Z specifies the behavior of this operator on Tensors that carry no data. 2025-10-10T01:45:05.3732491Z Given some input Tensors with certain properties 2025-10-10T01:45:05.3732916Z (sizes/strides/storage_offset/device), it specifies what the properties of 2025-10-10T01:45:05.3733306Z the output Tensors are. 2025-10-10T01:45:05.3733534Z 2025-10-10T01:45:05.3733798Z Please see :func:`torch.library.register_fake` for more details. 2025-10-10T01:45:05.3734119Z 2025-10-10T01:45:05.3734274Z Args: 2025-10-10T01:45:05.3734532Z fn (Callable): The function to register as the FakeTensor 2025-10-10T01:45:05.3734851Z implementation. 
2025-10-10T01:45:05.3735074Z 2025-10-10T01:45:05.3735243Z Examples: 2025-10-10T01:45:05.3735453Z >>> import torch 2025-10-10T01:45:05.3735691Z >>> import numpy as np 2025-10-10T01:45:05.3735944Z >>> from torch import Tensor 2025-10-10T01:45:05.3736187Z >>> 2025-10-10T01:45:05.3736457Z >>> # Example 1: an operator without data-dependent output shape 2025-10-10T01:45:05.3736990Z >>> @torch.library.custom_op("mylib::linear", mutates_args=()) 2025-10-10T01:45:05.3737397Z >>> def linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor: 2025-10-10T01:45:05.3737746Z >>> return (x @ weight.t()) + bias 2025-10-10T01:45:05.3737999Z >>> 2025-10-10T01:45:05.3738204Z >>> @linear.register_fake 2025-10-10T01:45:05.3738464Z >>> def _(x, weight, bias): 2025-10-10T01:45:05.3738852Z >>> assert x.dim() == 2 2025-10-10T01:45:05.3739117Z >>> assert weight.dim() == 2 2025-10-10T01:45:05.3739391Z >>> assert bias.dim() == 1 2025-10-10T01:45:05.3739679Z >>> assert x.shape[1] == weight.shape[1] 2025-10-10T01:45:05.3739984Z >>> assert weight.shape[0] == bias.shape[0] 2025-10-10T01:45:05.3740290Z >>> assert x.device == weight.device 2025-10-10T01:45:05.3740623Z >>> return x.new_empty(x.size(0), weight.size(0)) 2025-10-10T01:45:05.3740915Z >>> 2025-10-10T01:45:05.3741119Z >>> x = torch.randn(2, 2) 2025-10-10T01:45:05.3741386Z >>> weight = torch.randn(2, 2) 2025-10-10T01:45:05.3741654Z >>> bias = torch.randn(2) 2025-10-10T01:45:05.3741943Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-10-10T01:45:05.3742307Z >>> out = torch.compile(linear, fullgraph=True)(x, weight, bias) 2025-10-10T01:45:05.3742678Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-10-10T01:45:05.3743090Z >>> assert torch.allclose(out, torch.nn.functional.linear(x, weight, bias)) 2025-10-10T01:45:05.3743457Z >>> 2025-10-10T01:45:05.3743717Z >>> # Example 2: an operator with data-dependent output shape 2025-10-10T01:45:05.3744115Z >>> @torch.library.custom_op("mylib::nonzero", mutates_args=()) 
2025-10-10T01:45:05.3744479Z >>> def nonzero(x: Tensor) -> Tensor: 2025-10-10T01:45:05.3744761Z >>> x_np = x.cpu().numpy() 2025-10-10T01:45:05.3745050Z >>> res = np.stack(np.nonzero(x_np), axis=1) 2025-10-10T01:45:05.3745364Z >>> return torch.tensor(res, device=x.device) 2025-10-10T01:45:05.3745636Z >>> 2025-10-10T01:45:05.3745846Z >>> @nonzero.register_fake 2025-10-10T01:45:05.3746105Z >>> def _(x): 2025-10-10T01:45:05.3746388Z >>> # Number of nonzero-elements is data-dependent. 2025-10-10T01:45:05.3746753Z >>> # Since we cannot peek at the data in an abstract impl, 2025-10-10T01:45:05.3747128Z >>> # we use the ctx object to construct a new symint that 2025-10-10T01:45:05.3747462Z >>> # represents the data-dependent size. 2025-10-10T01:45:05.3747764Z >>> ctx = torch.library.get_ctx() 2025-10-10T01:45:05.3748070Z >>> nnz = ctx.new_dynamic_size() 2025-10-10T01:45:05.3748353Z >>> shape = [nnz, x.dim()] 2025-10-10T01:45:05.3748652Z >>> result = x.new_empty(shape, dtype=torch.int64) 2025-10-10T01:45:05.3748953Z >>> return result 2025-10-10T01:45:05.3749181Z >>> 2025-10-10T01:45:05.3749394Z >>> x = torch.tensor([0, 1, 2, 0, 0, 1]) 2025-10-10T01:45:05.3749693Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-10-10T01:45:05.3750024Z >>> out = torch.compile(nonzero, fullgraph=True)(x) 2025-10-10T01:45:05.3750343Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-10-10T01:45:05.3750650Z >>> assert torch.allclose(out, x.nonzero()) 2025-10-10T01:45:05.3750918Z 2025-10-10T01:45:05.3751079Z 2025-10-10T01:45:05.3751743Z Original Error: IndentationError('expected an indented block after function definition on line 36', ('', 37, 1, '_._ = None\n', 37, 2)) 2025-10-10T01:45:05.3752320Z 2025-10-10T01:45:05.3752481Z _._ = None 2025-10-10T01:45:05.3752654Z ^ 2025-10-10T01:45:05.3752824Z warnings.warn(msg) 2025-10-10T01:45:05.3753020Z 2025-10-10T01:45:05.3753263Z --- Parse Warning: 9 / 18 --- 2025-10-10T01:45:05.3754289Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=annotate in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/traceback.py line=244. 2025-10-10T01:45:05.3755446Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3755808Z 2025-10-10T01:45:05.3756075Z Temporarily adds custom annotations to the current tracing context. 2025-10-10T01:45:05.3756492Z The fx_node produced from this tracing context will have the 2025-10-10T01:45:05.3756851Z custom annotations in node.metadata["custom"] field. 2025-10-10T01:45:05.3757138Z 2025-10-10T01:45:05.3757426Z This context manager allows you to insert arbitrary metadata into the PT2 2025-10-10T01:45:05.3757895Z tracing system by updating the global `current_meta["custom"]` dictionary. 2025-10-10T01:45:05.3758344Z The annotations are automatically reverted after the context exits. 2025-10-10T01:45:05.3758683Z 2025-10-10T01:45:05.3759008Z This is intended for advanced users who need to attach additional metadata to the fx nodes 2025-10-10T01:45:05.3759525Z (e.g., for debugging, analysis, or external tooling) during export tracing. 2025-10-10T01:45:05.3759876Z 2025-10-10T01:45:05.3760043Z Note: 2025-10-10T01:45:05.3760327Z This API is **not backward compatible** and may evolve in future releases. 2025-10-10T01:45:05.3760663Z 2025-10-10T01:45:05.3760814Z Note: 2025-10-10T01:45:05.3761104Z This API is not compatible with fx.symbolic_trace or jit.trace. It's intended 2025-10-10T01:45:05.3761556Z to be used with PT2 family of tracers, e.g. torch.export and dynamo. 2025-10-10T01:45:05.3761868Z 2025-10-10T01:45:05.3762019Z Args: 2025-10-10T01:45:05.3762297Z annotation_dict (dict): A dictionary of custom key-value pairs to inject 2025-10-10T01:45:05.3762653Z into the FX trace metadata. 
2025-10-10T01:45:05.3762879Z 2025-10-10T01:45:05.3763035Z Example: 2025-10-10T01:45:05.3763268Z >>> with annotate({"source": "custom_pass", "tag": 42}): 2025-10-10T01:45:05.3763567Z ... # compute here 2025-10-10T01:45:05.3763868Z # After exiting the context, custom annotations are removed. 2025-10-10T01:45:05.3764165Z 2025-10-10T01:45:05.3764674Z Original Error: IndentationError("expected an indented block after 'with' statement on line 1", ('', 2, 19, ' # compute here\n', 2, -1)) 2025-10-10T01:45:05.3765240Z 2025-10-10T01:45:05.3765398Z # compute here 2025-10-10T01:45:05.3765593Z ^ 2025-10-10T01:45:05.3765794Z warnings.warn(msg) 2025-10-10T01:45:05.3765993Z 2025-10-10T01:45:05.3766233Z --- Parse Warning: 10 / 18 --- 2025-10-10T01:45:05.3767158Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=ReduceLROnPlateau in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py line=1587. 2025-10-10T01:45:05.3768166Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3768620Z Reduce learning rate when a metric has stopped improving. 2025-10-10T01:45:05.3768927Z 2025-10-10T01:45:05.3769185Z Models often benefit from reducing the learning rate by a factor 2025-10-10T01:45:05.3769592Z of 2-10 once learning stagnates. This scheduler reads a metrics 2025-10-10T01:45:05.3769994Z quantity and if no improvement is seen for a 'patience' number 2025-10-10T01:45:05.3770342Z of epochs, the learning rate is reduced. 2025-10-10T01:45:05.3770603Z 2025-10-10T01:45:05.3770975Z Args: 2025-10-10T01:45:05.3771202Z optimizer (Optimizer): Wrapped optimizer. 2025-10-10T01:45:05.3771527Z mode (str): One of `min`, `max`. 
In `min` mode, lr will 2025-10-10T01:45:05.3771873Z be reduced when the quantity monitored has stopped 2025-10-10T01:45:05.3772228Z decreasing; in `max` mode it will be reduced when the 2025-10-10T01:45:05.3772606Z quantity monitored has stopped increasing. Default: 'min'. 2025-10-10T01:45:05.3773128Z factor (float): Factor by which the learning rate will be 2025-10-10T01:45:05.3773491Z reduced. new_lr = lr * factor. Default: 0.1. 2025-10-10T01:45:05.3773868Z patience (int): The number of allowed epochs with no improvement after 2025-10-10T01:45:05.3774242Z which the learning rate will be reduced. 2025-10-10T01:45:05.3774614Z For example, consider the case of having no patience (`patience = 0`). 2025-10-10T01:45:05.3775193Z In the first epoch, a baseline is established and is always considered good as there's no previous baseline. 2025-10-10T01:45:05.3775737Z In the second epoch, if the performance is worse than the baseline, 2025-10-10T01:45:05.3776119Z we have what is considered an intolerable epoch. 2025-10-10T01:45:05.3776540Z Since the count of intolerable epochs (1) is greater than the patience level (0), 2025-10-10T01:45:05.3776989Z the learning rate is reduced at the end of this epoch. 2025-10-10T01:45:05.3777468Z From the third epoch onwards, the learning rate continues to be reduced at the end of each epoch 2025-10-10T01:45:05.3778066Z if the performance is worse than the baseline. If the performance improves or remains the same, 2025-10-10T01:45:05.3778501Z the learning rate is not adjusted. 2025-10-10T01:45:05.3778767Z Default: 10. 2025-10-10T01:45:05.3779072Z threshold (float): Threshold for measuring the new optimum, 2025-10-10T01:45:05.3779444Z to only focus on significant changes. Default: 1e-4. 2025-10-10T01:45:05.3779799Z threshold_mode (str): One of `rel`, `abs`. In `rel` mode, 2025-10-10T01:45:05.3780150Z dynamic_threshold = best * ( 1 + threshold ) in 'max' 2025-10-10T01:45:05.3780481Z mode or best * ( 1 - threshold ) in `min` mode. 
2025-10-10T01:45:05.3780817Z In `abs` mode, dynamic_threshold = best + threshold in 2025-10-10T01:45:05.3781171Z `max` mode or best - threshold in `min` mode. Default: 'rel'. 2025-10-10T01:45:05.3781543Z cooldown (int): Number of epochs to wait before resuming 2025-10-10T01:45:05.3781913Z normal operation after lr has been reduced. Default: 0. 2025-10-10T01:45:05.3782269Z min_lr (float or list): A scalar or a list of scalars. A 2025-10-10T01:45:05.3782618Z lower bound on the learning rate of all param groups 2025-10-10T01:45:05.3782939Z or each group respectively. Default: 0. 2025-10-10T01:45:05.3783276Z eps (float): Minimal decay applied to lr. If the difference 2025-10-10T01:45:05.3783649Z between new and old lr is smaller than eps, the update is 2025-10-10T01:45:05.3783972Z ignored. Default: 1e-8. 2025-10-10T01:45:05.3784200Z 2025-10-10T01:45:05.3784361Z Example: 2025-10-10T01:45:05.3784565Z >>> # xdoctest: +SKIP 2025-10-10T01:45:05.3784905Z >>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9) 2025-10-10T01:45:05.3785313Z >>> scheduler = ReduceLROnPlateau(optimizer, "min") 2025-10-10T01:45:05.3785621Z >>> for epoch in range(10): 2025-10-10T01:45:05.3785868Z >>> train(...) 2025-10-10T01:45:05.3786101Z >>> val_loss = validate(...) 2025-10-10T01:45:05.3786539Z >>> # Note that step should be called after validate() 2025-10-10T01:45:05.3786849Z >>> scheduler.step(val_loss) 2025-10-10T01:45:05.3787087Z 2025-10-10T01:45:05.3787334Z .. 
image:: ../scripts/lr_scheduler_images/ReduceLROnPlateau.png 2025-10-10T01:45:05.3787642Z 2025-10-10T01:45:05.3788067Z Original Error: IndentationError('unexpected indent', ('', 8, 4, ' scheduler.step(val_loss)\n', 8, -1)) 2025-10-10T01:45:05.3788551Z 2025-10-10T01:45:05.3788727Z scheduler.step(val_loss) 2025-10-10T01:45:05.3789080Z ^ 2025-10-10T01:45:05.3789251Z warnings.warn(msg) 2025-10-10T01:45:05.3789449Z 2025-10-10T01:45:05.3789692Z --- Parse Warning: 11 / 18 --- 2025-10-10T01:45:05.3790820Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=ActivationSparsifier in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/activation_sparsifier/activation_sparsifier.py line=16. 2025-10-10T01:45:05.3792017Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3792371Z 2025-10-10T01:45:05.3792667Z The Activation sparsifier class aims to sparsify/prune activations in a neural 2025-10-10T01:45:05.3793146Z network. The idea is to attach the sparsifier to a layer (or layers) and it 2025-10-10T01:45:05.3793612Z zeroes out the activations based on the mask_fn (or sparsification function) 2025-10-10T01:45:05.3793981Z input by the user. 2025-10-10T01:45:05.3794356Z The mask_fn is applied once all the inputs are aggregated and reduced i.e. 2025-10-10T01:45:05.3794747Z mask = mask_fn(reduce_fn(aggregate_fn(activations))) 2025-10-10T01:45:05.3795021Z 2025-10-10T01:45:05.3795178Z Note:: 2025-10-10T01:45:05.3795527Z The sparsification mask is computed on the input **before it goes through the attached layer**. 2025-10-10T01:45:05.3795933Z 2025-10-10T01:45:05.3796082Z Args: 2025-10-10T01:45:05.3796263Z model (nn.Module): 2025-10-10T01:45:05.3796574Z The model whose layers will be sparsified. 
The layers that needs to be 2025-10-10T01:45:05.3797037Z sparsified should be added separately using the register_layer() function 2025-10-10T01:45:05.3797414Z aggregate_fn (Optional, Callable): 2025-10-10T01:45:05.3797793Z default aggregate_fn that is used if not specified while registering the layer. 2025-10-10T01:45:05.3798233Z specifies how inputs should be aggregated over time. 2025-10-10T01:45:05.3798677Z The aggregate_fn should usually take 2 torch tensors and return the aggregated tensor. 2025-10-10T01:45:05.3799062Z Example 2025-10-10T01:45:05.3799318Z def add_agg_fn(tensor1, tensor2): return tensor1 + tensor2 2025-10-10T01:45:05.3799647Z reduce_fn (Optional, Callable): 2025-10-10T01:45:05.3800028Z default reduce_fn that is used if not specified while registering the layer. 2025-10-10T01:45:05.3800514Z reduce_fn will be called on the aggregated tensor i.e. the tensor obtained after 2025-10-10T01:45:05.3800909Z calling agg_fn() on all inputs. 2025-10-10T01:45:05.3801170Z Example 2025-10-10T01:45:05.3801463Z def mean_reduce_fn(agg_tensor): return agg_tensor.mean(dim=0) 2025-10-10T01:45:05.3801803Z mask_fn (Optional, Callable): 2025-10-10T01:45:05.3802227Z default mask_fn that is used to create the sparsification mask using the tensor obtained after 2025-10-10T01:45:05.3802768Z calling the reduce_fn(). This is used by default if a custom one is passed in the 2025-10-10T01:45:05.3803146Z register_layer(). 2025-10-10T01:45:05.3803584Z Note that the mask_fn() definition should contain the sparse arguments that is passed in sparse_config 2025-10-10T01:45:05.3804202Z arguments. 2025-10-10T01:45:05.3804442Z features (Optional, list): 2025-10-10T01:45:05.3804721Z default selected features to sparsify. 2025-10-10T01:45:05.3805130Z If this is non-empty, then the mask_fn will be applied for each feature of the input. 
2025-10-10T01:45:05.3805518Z For example, 2025-10-10T01:45:05.3805866Z mask = [mask_fn(reduce_fn(aggregated_fn(input[feature])) for feature in features] 2025-10-10T01:45:05.3806407Z feature_dim (Optional, int): 2025-10-10T01:45:05.3806800Z default dimension of input features. Again, features along this dim will be chosen 2025-10-10T01:45:05.3807192Z for sparsification. 2025-10-10T01:45:05.3807445Z sparse_config (Dict): 2025-10-10T01:45:05.3807786Z Default configuration for the mask_fn. This config will be passed 2025-10-10T01:45:05.3808130Z with the mask_fn() 2025-10-10T01:45:05.3808362Z 2025-10-10T01:45:05.3808518Z Example: 2025-10-10T01:45:05.3808701Z >>> # xdoctest: +SKIP 2025-10-10T01:45:05.3808928Z >>> model = SomeModel() 2025-10-10T01:45:05.3809269Z >>> act_sparsifier = ActivationSparsifier(...) # init activation sparsifier 2025-10-10T01:45:05.3809637Z >>> # Initialize aggregate_fn 2025-10-10T01:45:05.3809875Z >>> def agg_fn(x, y): 2025-10-10T01:45:05.3810091Z >>> return x + y 2025-10-10T01:45:05.3810303Z >>> 2025-10-10T01:45:05.3810481Z >>> # Initialize reduce_fn 2025-10-10T01:45:05.3810712Z >>> def reduce_fn(x): 2025-10-10T01:45:05.3810935Z >>> return torch.mean(x, dim=0) 2025-10-10T01:45:05.3811172Z >>> 2025-10-10T01:45:05.3811342Z >>> # Initialize mask_fn 2025-10-10T01:45:05.3811565Z >>> def mask_fn(data): 2025-10-10T01:45:05.3811820Z >>> return torch.eye(data.shape).to(data.device) 2025-10-10T01:45:05.3812085Z >>> 2025-10-10T01:45:05.3812253Z >>> 2025-10-10T01:45:05.3812443Z >>> act_sparsifier.register_layer( 2025-10-10T01:45:05.3812703Z ... model.some_layer, 2025-10-10T01:45:05.3812933Z ... aggregate_fn=agg_fn, 2025-10-10T01:45:05.3813171Z ... reduce_fn=reduce_fn, 2025-10-10T01:45:05.3813399Z ... mask_fn=mask_fn, 2025-10-10T01:45:05.3813610Z ... 
) 2025-10-10T01:45:05.3813779Z >>> 2025-10-10T01:45:05.3813955Z >>> # start training process 2025-10-10T01:45:05.3814195Z >>> for _ in [...]: 2025-10-10T01:45:05.3814408Z >>> # epoch starts 2025-10-10T01:45:05.3814682Z >>> # model.forward(), compute_loss() and model.backwards() 2025-10-10T01:45:05.3814983Z >>> # epoch ends 2025-10-10T01:45:05.3815198Z >>> act_sparsifier.step() 2025-10-10T01:45:05.3815440Z >>> # end training process 2025-10-10T01:45:05.3815681Z >>> sparsifier.squash_mask() 2025-10-10T01:45:05.3815903Z 2025-10-10T01:45:05.3816396Z Original Error: IndentationError("expected an indented block after 'for' statement on line 25", ('', 26, 1, '_._ = None\n', 26, 2)) 2025-10-10T01:45:05.3816938Z 2025-10-10T01:45:05.3817090Z _._ = None 2025-10-10T01:45:05.3817259Z ^ 2025-10-10T01:45:05.3817434Z warnings.warn(msg) 2025-10-10T01:45:05.3817632Z 2025-10-10T01:45:05.3817868Z --- Parse Warning: 12 / 18 --- 2025-10-10T01:45:05.3818708Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=vmap in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py line=39. 2025-10-10T01:45:05.3819639Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3819995Z 2025-10-10T01:45:05.3820255Z vmap is the vectorizing map; ``vmap(func)`` returns a new function that 2025-10-10T01:45:05.3820823Z maps ``func`` over some dimension of the inputs. Semantically, vmap 2025-10-10T01:45:05.3821252Z pushes the map into PyTorch operations called by ``func``, effectively 2025-10-10T01:45:05.3821605Z vectorizing those operations. 2025-10-10T01:45:05.3821829Z 2025-10-10T01:45:05.3822087Z vmap is useful for handling batch dimensions: one can write a function 2025-10-10T01:45:05.3822505Z ``func`` that runs on examples and then lift it to a function that can 2025-10-10T01:45:05.3822918Z take batches of examples with ``vmap(func)``. 
vmap can also be used to 2025-10-10T01:45:05.3823469Z compute batched gradients when composed with autograd. 2025-10-10T01:45:05.3823756Z 2025-10-10T01:45:05.3823912Z .. note:: 2025-10-10T01:45:05.3824167Z :func:`torch.vmap` is aliased to :func:`torch.func.vmap` for 2025-10-10T01:45:05.3824509Z convenience. Use whichever one you'd like. 2025-10-10T01:45:05.3824770Z 2025-10-10T01:45:05.3824926Z Args: 2025-10-10T01:45:05.3825201Z func (function): A Python function that takes one or more arguments. 2025-10-10T01:45:05.3825554Z Must return one or more Tensors. 2025-10-10T01:45:05.3825887Z in_dims (int or nested structure): Specifies which dimension of the 2025-10-10T01:45:05.3826276Z inputs should be mapped over. ``in_dims`` should have a 2025-10-10T01:45:05.3826657Z structure like the inputs. If the ``in_dim`` for a particular 2025-10-10T01:45:05.3827043Z input is None, then that indicates there is no map dimension. 2025-10-10T01:45:05.3827363Z Default: 0. 2025-10-10T01:45:05.3827656Z out_dims (int or Tuple[int]): Specifies where the mapped dimension 2025-10-10T01:45:05.3828052Z should appear in the outputs. If ``out_dims`` is a Tuple, then 2025-10-10T01:45:05.3828422Z it should have one element per output. Default: 0. 2025-10-10T01:45:05.3828788Z randomness (str): Specifies whether the randomness in this 2025-10-10T01:45:05.3829201Z vmap should be the same or different across batches. If 'different', 2025-10-10T01:45:05.3829620Z the randomness for each batch will be different. If 'same', the 2025-10-10T01:45:05.3830044Z randomness will be the same across batches. If 'error', any calls to 2025-10-10T01:45:05.3830469Z random functions will error. Default: 'error'. WARNING: this flag 2025-10-10T01:45:05.3830887Z only applies to random PyTorch operations and does not apply to 2025-10-10T01:45:05.3831264Z Python's random module or numpy randomness. 2025-10-10T01:45:05.3831661Z chunk_size (None or int): If None (default), apply a single vmap over inputs. 
2025-10-10T01:45:05.3832114Z If not None, then compute the vmap :attr:`chunk_size` samples at a time. 2025-10-10T01:45:05.3832590Z Note that :attr:`chunk_size=1` is equivalent to computing the vmap with a for-loop. 2025-10-10T01:45:05.3833106Z If you run into memory issues computing the vmap, please try a non-None chunk_size. 2025-10-10T01:45:05.3833470Z 2025-10-10T01:45:05.3833634Z Returns: 2025-10-10T01:45:05.3833889Z Returns a new "batched" function. It takes the same inputs as 2025-10-10T01:45:05.3834360Z ``func``, except each input has an extra dimension at the index 2025-10-10T01:45:05.3834750Z specified by ``in_dims``. It takes returns the same outputs as 2025-10-10T01:45:05.3835132Z ``func``, except each output has an extra dimension at the index 2025-10-10T01:45:05.3835465Z specified by ``out_dims``. 2025-10-10T01:45:05.3835694Z 2025-10-10T01:45:05.3835860Z .. warning: 2025-10-10T01:45:05.3836141Z :func:`vmap` works best with functional-style code. Please do not 2025-10-10T01:45:05.3836541Z perform any side-effects in ``func``, with the exception of 2025-10-10T01:45:05.3836971Z in-place PyTorch operations. Examples of side-effects include mutating 2025-10-10T01:45:05.3837601Z Python data structures and assigning values to variables not captured 2025-10-10T01:45:05.3837946Z in ``func``. 2025-10-10T01:45:05.3838132Z 2025-10-10T01:45:05.3838406Z One example of using :func:`vmap` is to compute batched dot products. PyTorch 2025-10-10T01:45:05.3838862Z doesn't provide a batched ``torch.dot`` API; instead of unsuccessfully 2025-10-10T01:45:05.3839296Z rummaging through docs, use :func:`vmap` to construct a new function. 
2025-10-10T01:45:05.3839622Z 2025-10-10T01:45:05.3839797Z >>> torch.dot # [D], [D] -> [] 2025-10-10T01:45:05.3840409Z >>> batched_dot = torch.func.vmap(torch.dot) # [N, D], [N, D] -> [N] 2025-10-10T01:45:05.3840785Z >>> x, y = torch.randn(2, 5), torch.randn(2, 5) 2025-10-10T01:45:05.3841059Z >>> batched_dot(x, y) 2025-10-10T01:45:05.3841267Z 2025-10-10T01:45:05.3841540Z :func:`vmap` can be helpful in hiding batch dimensions, leading to a simpler 2025-10-10T01:45:05.3841911Z model authoring experience. 2025-10-10T01:45:05.3842128Z 2025-10-10T01:45:05.3842329Z >>> batch_size, feature_size = 3, 5 2025-10-10T01:45:05.3842655Z >>> weights = torch.randn(feature_size, requires_grad=True) 2025-10-10T01:45:05.3842974Z >>> 2025-10-10T01:45:05.3843170Z >>> def model(feature_vec): 2025-10-10T01:45:05.3843458Z >>> # Very simple linear model with activation 2025-10-10T01:45:05.3843774Z >>> return feature_vec.dot(weights).relu() 2025-10-10T01:45:05.3844047Z >>> 2025-10-10T01:45:05.3844294Z >>> examples = torch.randn(batch_size, feature_size) 2025-10-10T01:45:05.3844614Z >>> result = torch.vmap(model)(examples) 2025-10-10T01:45:05.3844875Z 2025-10-10T01:45:05.3845161Z :func:`vmap` can also help vectorize computations that were previously difficult 2025-10-10T01:45:05.3845653Z or impossible to batch. One example is higher-order gradient computation. 2025-10-10T01:45:05.3846126Z The PyTorch autograd engine computes vjps (vector-Jacobian products). 2025-10-10T01:45:05.3846601Z Computing a full Jacobian matrix for some function f: R^N -> R^N usually 2025-10-10T01:45:05.3847080Z requires N calls to ``autograd.grad``, one per Jacobian row. Using :func:`vmap`, 2025-10-10T01:45:05.3847559Z we can vectorize the whole computation, computing the Jacobian in a single 2025-10-10T01:45:05.3847922Z call to ``autograd.grad``. 
2025-10-10T01:45:05.3848138Z 2025-10-10T01:45:05.3848294Z >>> # Setup 2025-10-10T01:45:05.3848473Z >>> N = 5 2025-10-10T01:45:05.3848561Z >>> f = lambda x: x**2 2025-10-10T01:45:05.3848676Z >>> x = torch.randn(N, requires_grad=True) 2025-10-10T01:45:05.3848747Z >>> y = f(x) 2025-10-10T01:45:05.3848835Z >>> I_N = torch.eye(N) 2025-10-10T01:45:05.3848901Z >>> 2025-10-10T01:45:05.3848989Z >>> # Sequential approach 2025-10-10T01:45:05.3849165Z >>> jacobian_rows = [torch.autograd.grad(y, x, v, retain_graph=True)[0] 2025-10-10T01:45:05.3849268Z >>> for v in I_N.unbind()] 2025-10-10T01:45:05.3849369Z >>> jacobian = torch.stack(jacobian_rows) 2025-10-10T01:45:05.3849439Z >>> 2025-10-10T01:45:05.3849538Z >>> # vectorized gradient computation 2025-10-10T01:45:05.3849617Z >>> def get_vjp(v): 2025-10-10T01:45:05.3849717Z >>> return torch.autograd.grad(y, x, v) 2025-10-10T01:45:05.3849824Z >>> jacobian = torch.vmap(get_vjp)(I_N) 2025-10-10T01:45:05.3849890Z 2025-10-10T01:45:05.3850104Z :func:`vmap` can also be nested, producing an output with multiple batched dimensions 2025-10-10T01:45:05.3850176Z 2025-10-10T01:45:05.3850266Z >>> torch.dot # [D], [D] -> [] 2025-10-10T01:45:05.3850356Z >>> batched_dot = torch.vmap( 2025-10-10T01:45:05.3850448Z ... torch.vmap(torch.dot) 2025-10-10T01:45:05.3850546Z ... 
) # [N1, N0, D], [N1, N0, D] -> [N1, N0] 2025-10-10T01:45:05.3850662Z >>> x, y = torch.randn(2, 3, 5), torch.randn(2, 3, 5) 2025-10-10T01:45:05.3850900Z >>> batched_dot(x, y) # tensor of size [2, 3] 2025-10-10T01:45:05.3850972Z 2025-10-10T01:45:05.3851169Z If the inputs are not batched along the first dimension, ``in_dims`` specifies 2025-10-10T01:45:05.3851305Z the dimension that each inputs are batched along as 2025-10-10T01:45:05.3851369Z 2025-10-10T01:45:05.3851456Z >>> torch.dot # [N], [N] -> [] 2025-10-10T01:45:05.3851633Z >>> batched_dot = torch.vmap(torch.dot, in_dims=1) # [N, D], [N, D] -> [D] 2025-10-10T01:45:05.3851861Z >>> x, y = torch.randn(2, 5), torch.randn(2, 5) 2025-10-10T01:45:05.3851940Z >>> batched_dot( 2025-10-10T01:45:05.3852012Z ... x, y 2025-10-10T01:45:05.3852166Z ... ) # output is [5] instead of [2] if batched along the 0th dimension 2025-10-10T01:45:05.3852234Z 2025-10-10T01:45:05.3852440Z If there are multiple inputs each of which is batched along different dimensions, 2025-10-10T01:45:05.3852612Z ``in_dims`` must be a tuple with the batch dimension for each input as 2025-10-10T01:45:05.3852677Z 2025-10-10T01:45:05.3852764Z >>> torch.dot # [D], [D] -> [] 2025-10-10T01:45:05.3852952Z >>> batched_dot = torch.vmap(torch.dot, in_dims=(0, None)) # [N, D], [D] -> [N] 2025-10-10T01:45:05.3853057Z >>> x, y = torch.randn(2, 5), torch.randn(5) 2025-10-10T01:45:05.3853132Z >>> batched_dot( 2025-10-10T01:45:05.3853206Z ... x, y 2025-10-10T01:45:05.3853360Z ... 
) # second arg doesn't have a batch dim because in_dim[1] was None 2025-10-10T01:45:05.3853438Z 2025-10-10T01:45:05.3853625Z If the input is a Python struct, ``in_dims`` must be a tuple containing a struct 2025-10-10T01:45:05.3853718Z matching the shape of the input: 2025-10-10T01:45:05.3853783Z 2025-10-10T01:45:05.3853899Z >>> f = lambda dict: torch.dot(dict["x"], dict["y"]) 2025-10-10T01:45:05.3853995Z >>> x, y = torch.randn(2, 5), torch.randn(5) 2025-10-10T01:45:05.3854084Z >>> input = {"x": x, "y": y} 2025-10-10T01:45:05.3854231Z >>> batched_dot = torch.vmap(f, in_dims=({"x": 0, "y": None},)) 2025-10-10T01:45:05.3854316Z >>> batched_dot(input) 2025-10-10T01:45:05.3854384Z 2025-10-10T01:45:05.3854608Z By default, the output is batched along the first dimension. However, it can be batched 2025-10-10T01:45:05.3854712Z along any dimension by using ``out_dims`` 2025-10-10T01:45:05.3854782Z 2025-10-10T01:45:05.3854859Z >>> f = lambda x: x**2 2025-10-10T01:45:05.3854950Z >>> x = torch.randn(2, 5) 2025-10-10T01:45:05.3855051Z >>> batched_pow = torch.vmap(f, out_dims=1) 2025-10-10T01:45:05.3855134Z >>> batched_pow(x) # [5, 2] 2025-10-10T01:45:05.3855200Z 2025-10-10T01:45:05.3855433Z For any function that uses kwargs, the returned function will not batch the kwargs but will 2025-10-10T01:45:05.3855506Z accept kwargs 2025-10-10T01:45:05.3855574Z 2025-10-10T01:45:05.3855655Z >>> x = torch.randn([2, 5]) 2025-10-10T01:45:05.3855745Z >>> def fn(x, scale=4.): 2025-10-10T01:45:05.3855823Z >>> return x * scale 2025-10-10T01:45:05.3855892Z >>> 2025-10-10T01:45:05.3855983Z >>> batched_pow = torch.vmap(fn) 2025-10-10T01:45:05.3856101Z >>> assert torch.allclose(batched_pow(x), x * 4) 2025-10-10T01:45:05.3856281Z >>> batched_pow(x, scale=x) # scale is not batched, output has shape [2, 2, 5] 2025-10-10T01:45:05.3856349Z 2025-10-10T01:45:05.3856421Z .. 
note:: 2025-10-10T01:45:05.3856609Z vmap does not provide general autobatching or handle variable-length 2025-10-10T01:45:05.3856694Z sequences out of the box. 2025-10-10T01:45:05.3856762Z 2025-10-10T01:45:05.3857176Z Original Error: IndentationError('expected an indented block after function definition on line 4', ('', 5, 1, '_._ = None\n', 5, 2)) 2025-10-10T01:45:05.3857245Z 2025-10-10T01:45:05.3857313Z _._ = None 2025-10-10T01:45:05.3857382Z ^ 2025-10-10T01:45:05.3857603Z warnings.warn(msg) 2025-10-10T01:45:05.3857676Z 2025-10-10T01:45:05.3857833Z --- Parse Warning: 13 / 18 --- 2025-10-10T01:45:05.3858500Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=grad in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py line=306. 2025-10-10T01:45:05.3858702Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3859015Z ``grad`` operator helps computing gradients of ``func`` with respect to the 2025-10-10T01:45:05.3859177Z input(s) specified by ``argnums``. This operator can be nested to 2025-10-10T01:45:05.3859270Z compute higher-order gradients. 2025-10-10T01:45:05.3859340Z 2025-10-10T01:45:05.3859407Z Args: 2025-10-10T01:45:05.3859579Z func (Callable): A Python function that takes one or more arguments. 2025-10-10T01:45:05.3859805Z Must return a single-element Tensor. If specified ``has_aux`` equals ``True``, 2025-10-10T01:45:05.3860021Z function can return a tuple of single-element Tensor and other auxiliary objects: 2025-10-10T01:45:05.3860103Z ``(output, aux)``. 2025-10-10T01:45:05.3860328Z argnums (int or Tuple[int]): Specifies arguments to compute gradients with respect to. 2025-10-10T01:45:05.3860494Z ``argnums`` can be single integer or tuple of integers. Default: 0. 
2025-10-10T01:45:05.3860682Z has_aux (bool): Flag indicating that ``func`` returns a tensor and other 2025-10-10T01:45:05.3860822Z auxiliary objects: ``(output, aux)``. Default: False. 2025-10-10T01:45:05.3860889Z 2025-10-10T01:45:05.3860960Z Returns: 2025-10-10T01:45:05.3861186Z Function to compute gradients with respect to its inputs. By default, the output of 2025-10-10T01:45:05.3861374Z the function is the gradient tensor(s) with respect to the first argument. 2025-10-10T01:45:05.3861599Z If specified ``has_aux`` equals ``True``, tuple of gradients and output auxiliary objects 2025-10-10T01:45:05.3861794Z is returned. If ``argnums`` is a tuple of integers, a tuple of output gradients with 2025-10-10T01:45:05.3861917Z respect to each ``argnums`` value is returned. 2025-10-10T01:45:05.3861985Z 2025-10-10T01:45:05.3862072Z Example of using ``grad``: 2025-10-10T01:45:05.3862138Z 2025-10-10T01:45:05.3862232Z >>> # xdoctest: +SKIP 2025-10-10T01:45:05.3862323Z >>> from torch.func import grad 2025-10-10T01:45:05.3862411Z >>> x = torch.randn([]) 2025-10-10T01:45:05.3862512Z >>> cos_x = grad(lambda x: torch.sin(x))(x) 2025-10-10T01:45:05.3862618Z >>> assert torch.allclose(cos_x, x.cos()) 2025-10-10T01:45:05.3862687Z >>> 2025-10-10T01:45:05.3862779Z >>> # Second-order gradients 2025-10-10T01:45:05.3862904Z >>> neg_sin_x = grad(grad(lambda x: torch.sin(x)))(x) 2025-10-10T01:45:05.3863018Z >>> assert torch.allclose(neg_sin_x, -x.sin()) 2025-10-10T01:45:05.3863083Z 2025-10-10T01:45:05.3863290Z When composed with ``vmap``, ``grad`` can be used to compute per-sample-gradients: 2025-10-10T01:45:05.3863355Z 2025-10-10T01:45:05.3863441Z >>> # xdoctest: +SKIP 2025-10-10T01:45:05.3863544Z >>> from torch.func import grad, vmap 2025-10-10T01:45:05.3863645Z >>> batch_size, feature_size = 3, 5 2025-10-10T01:45:05.3863718Z >>> 2025-10-10T01:45:05.3863819Z >>> def model(weights, feature_vec): 2025-10-10T01:45:05.3863924Z >>> # Very simple linear model with activation 
2025-10-10T01:45:05.3864022Z >>> assert feature_vec.dim() == 1 2025-10-10T01:45:05.3864125Z >>> return feature_vec.dot(weights).relu() 2025-10-10T01:45:05.3864194Z >>> 2025-10-10T01:45:05.3864448Z >>> def compute_loss(weights, example, target): 2025-10-10T01:45:05.3864549Z >>> y = model(weights, example) 2025-10-10T01:45:05.3864661Z >>> return ((y - target) ** 2).mean() # MSELoss 2025-10-10T01:45:05.3864731Z >>> 2025-10-10T01:45:05.3864874Z >>> weights = torch.randn(feature_size, requires_grad=True) 2025-10-10T01:45:05.3864996Z >>> examples = torch.randn(batch_size, feature_size) 2025-10-10T01:45:05.3865090Z >>> targets = torch.randn(batch_size) 2025-10-10T01:45:05.3865371Z >>> inputs = (weights, examples, targets) 2025-10-10T01:45:05.3865558Z >>> grad_weight_per_example = vmap(grad(compute_loss), in_dims=(None, 0, 0))( 2025-10-10T01:45:05.3865638Z ... *inputs 2025-10-10T01:45:05.3865705Z ... ) 2025-10-10T01:45:05.3865773Z 2025-10-10T01:45:05.3865916Z Example of using ``grad`` with ``has_aux`` and ``argnums``: 2025-10-10T01:45:05.3865984Z 2025-10-10T01:45:05.3866072Z >>> # xdoctest: +SKIP 2025-10-10T01:45:05.3866167Z >>> from torch.func import grad 2025-10-10T01:45:05.3866261Z >>> def my_loss_func(y, y_pred): 2025-10-10T01:45:05.3866374Z >>> loss_per_sample = (0.5 * y_pred - y) ** 2 2025-10-10T01:45:05.3866475Z >>> loss = loss_per_sample.mean() 2025-10-10T01:45:05.3866582Z >>> return loss, (y_pred, loss_per_sample) 2025-10-10T01:45:05.3866648Z >>> 2025-10-10T01:45:05.3866775Z >>> fn = grad(my_loss_func, argnums=(0, 1), has_aux=True) 2025-10-10T01:45:05.3866871Z >>> y_true = torch.rand(4) 2025-10-10T01:45:05.3866985Z >>> y_preds = torch.rand(4, requires_grad=True) 2025-10-10T01:45:05.3867071Z >>> out = fn(y_true, y_preds) 2025-10-10T01:45:05.3867275Z >>> # > output is ((grads w.r.t y_true, grads w.r.t y_preds), (y_pred, loss_per_sample)) 2025-10-10T01:45:05.3867343Z 2025-10-10T01:45:05.3867412Z .. 
note:: 2025-10-10T01:45:05.3867562Z Using PyTorch ``torch.no_grad`` together with ``grad``. 2025-10-10T01:45:05.3867630Z 2025-10-10T01:45:05.3867755Z Case 1: Using ``torch.no_grad`` inside a function: 2025-10-10T01:45:05.3867822Z 2025-10-10T01:45:05.3867912Z >>> # xdoctest: +SKIP 2025-10-10T01:45:05.3867986Z >>> def f(x): 2025-10-10T01:45:05.3868086Z >>> with torch.no_grad(): 2025-10-10T01:45:05.3868165Z >>> c = x ** 2 2025-10-10T01:45:05.3868258Z >>> return x - c 2025-10-10T01:45:05.3868327Z 2025-10-10T01:45:05.3868495Z In this case, ``grad(f)(x)`` will respect the inner ``torch.no_grad``. 2025-10-10T01:45:05.3868562Z 2025-10-10T01:45:05.3868715Z Case 2: Using ``grad`` inside ``torch.no_grad`` context manager: 2025-10-10T01:45:05.3868780Z 2025-10-10T01:45:05.3868864Z >>> # xdoctest: +SKIP 2025-10-10T01:45:05.3868952Z >>> with torch.no_grad(): 2025-10-10T01:45:05.3869040Z >>> grad(f)(x) 2025-10-10T01:45:05.3869108Z 2025-10-10T01:45:05.3869290Z In this case, ``grad`` will respect the inner ``torch.no_grad``, but not the 2025-10-10T01:45:05.3869465Z outer one. This is because ``grad`` is a "function transform": its result 2025-10-10T01:45:05.3869647Z should not depend on the result of a context manager outside of ``f``. 2025-10-10T01:45:05.3869714Z 2025-10-10T01:45:05.3869791Z 2025-10-10T01:45:05.3870207Z Original Error: IndentationError('expected an indented block after function definition on line 5', ('', 6, 1, '_._ = None\n', 6, 2)) 2025-10-10T01:45:05.3870279Z 2025-10-10T01:45:05.3870348Z _._ = None 2025-10-10T01:45:05.3870420Z ^ 2025-10-10T01:45:05.3870502Z warnings.warn(msg) 2025-10-10T01:45:05.3870571Z 2025-10-10T01:45:05.3870721Z --- Parse Warning: 14 / 18 --- 2025-10-10T01:45:05.3871616Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=register_parametrization in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrize.py line=438. 
2025-10-10T01:45:05.3871822Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3871953Z Register a parametrization to a tensor in a module. 2025-10-10T01:45:05.3872019Z 2025-10-10T01:45:05.3872244Z Assume that ``tensor_name="weight"`` for simplicity. When accessing ``module.weight``, 2025-10-10T01:45:05.3872594Z the module will return the parametrized version ``parametrization(module.weight)``. 2025-10-10T01:45:05.3872810Z If the original tensor requires a gradient, the backward pass will differentiate 2025-10-10T01:45:05.3873041Z through :attr:`parametrization`, and the optimizer will update the tensor accordingly. 2025-10-10T01:45:05.3873113Z 2025-10-10T01:45:05.3873370Z The first time that a module registers a parametrization, this function will add an attribute 2025-10-10T01:45:05.3873571Z ``parametrizations`` to the module of type :class:`~ParametrizationList`. 2025-10-10T01:45:05.3873639Z 2025-10-10T01:45:05.3873841Z The list of parametrizations on the tensor ``weight`` will be accessible under 2025-10-10T01:45:05.3873949Z ``module.parametrizations.weight``. 2025-10-10T01:45:05.3874018Z 2025-10-10T01:45:05.3874274Z The original tensor will be accessible under 2025-10-10T01:45:05.3874411Z ``module.parametrizations.weight.original``. 2025-10-10T01:45:05.3874477Z 2025-10-10T01:45:05.3874689Z Parametrizations may be concatenated by registering several parametrizations 2025-10-10T01:45:05.3874776Z on the same attribute. 2025-10-10T01:45:05.3874847Z 2025-10-10T01:45:05.3875038Z The training mode of a registered parametrization is updated on registration 2025-10-10T01:45:05.3875162Z to match the training mode of the host module 2025-10-10T01:45:05.3875229Z 2025-10-10T01:45:05.3875478Z Parametrized parameters and buffers have an inbuilt caching system that can be activated 2025-10-10T01:45:05.3875580Z using the context manager :func:`cached`. 
2025-10-10T01:45:05.3875652Z 2025-10-10T01:45:05.3875846Z A :attr:`parametrization` may optionally implement a method with signature 2025-10-10T01:45:05.3875920Z 2025-10-10T01:45:05.3876007Z .. code-block:: python 2025-10-10T01:45:05.3876087Z 2025-10-10T01:45:05.3876262Z def right_inverse(self, X: Tensor) -> Union[Tensor, Sequence[Tensor]] 2025-10-10T01:45:05.3876334Z 2025-10-10T01:45:05.3876547Z This method is called on the unparametrized tensor when the first parametrization 2025-10-10T01:45:05.3876720Z is registered to compute the initial value of the original tensor. 2025-10-10T01:45:05.3876965Z If this method is not implemented, the original tensor will be just the unparametrized tensor. 2025-10-10T01:45:05.3877038Z 2025-10-10T01:45:05.3877288Z If all the parametrizations registered on a tensor implement `right_inverse` it is possible 2025-10-10T01:45:05.3877520Z to initialize a parametrized tensor by assigning to it, as shown in the example below. 2025-10-10T01:45:05.3877586Z 2025-10-10T01:45:05.3877772Z It is possible for the first parametrization to depend on several inputs. 2025-10-10T01:45:05.3877965Z This may be implemented returning a tuple of tensors from ``right_inverse`` 2025-10-10T01:45:05.3878167Z (see the example implementation of a ``RankOne`` parametrization below). 2025-10-10T01:45:05.3878237Z 2025-10-10T01:45:05.3878510Z In this case, the unconstrained tensors are also located under ``module.parametrizations.weight`` 2025-10-10T01:45:05.3878625Z with names ``original0``, ``original1``,... 2025-10-10T01:45:05.3878689Z 2025-10-10T01:45:05.3878767Z .. note:: 2025-10-10T01:45:05.3879008Z 2025-10-10T01:45:05.3879237Z If unsafe=False (default) both the forward and right_inverse methods will be called 2025-10-10T01:45:05.3879359Z once to perform a number of consistency checks. 
2025-10-10T01:45:05.3879577Z If unsafe=True, then right_inverse will be called if the tensor is not parametrized, 2025-10-10T01:45:05.3879677Z and nothing will be called otherwise. 2025-10-10T01:45:05.3879746Z 2025-10-10T01:45:05.3879821Z .. note:: 2025-10-10T01:45:05.3880046Z 2025-10-10T01:45:05.3880220Z In most situations, ``right_inverse`` will be a function such that 2025-10-10T01:45:05.3880325Z ``forward(right_inverse(X)) == X`` (see 2025-10-10T01:45:05.3880562Z `right inverse `_). 2025-10-10T01:45:05.3880771Z Sometimes, when the parametrization is not surjective, it may be reasonable 2025-10-10T01:45:05.3880862Z to relax this. 2025-10-10T01:45:05.3880933Z 2025-10-10T01:45:05.3881008Z .. warning:: 2025-10-10T01:45:05.3881076Z 2025-10-10T01:45:05.3881296Z If a parametrization depends on several inputs, :func:`~register_parametrization` 2025-10-10T01:45:05.3881513Z will register a number of new parameters. If such parametrization is registered 2025-10-10T01:45:05.3881731Z after the optimizer is created, these new parameters will need to be added manually 2025-10-10T01:45:05.3881907Z to the optimizer. See :meth:`torch.Optimizer.add_param_group`. 2025-10-10T01:45:05.3881975Z 2025-10-10T01:45:05.3882048Z Args: 2025-10-10T01:45:05.3882222Z module (nn.Module): module on which to register the parametrization 2025-10-10T01:45:05.3882399Z tensor_name (str): name of the parameter or buffer on which to register 2025-10-10T01:45:05.3882491Z the parametrization 2025-10-10T01:45:05.3882670Z parametrization (nn.Module): the parametrization to register 2025-10-10T01:45:05.3882748Z Keyword args: 2025-10-10T01:45:05.3882931Z unsafe (bool): a boolean flag that denotes whether the parametrization 2025-10-10T01:45:05.3883091Z may change the dtype and shape of the tensor. Default: `False` 2025-10-10T01:45:05.3883308Z Warning: the parametrization is not checked for consistency upon registration. 2025-10-10T01:45:05.3883409Z Enable this flag at your own risk. 
2025-10-10T01:45:05.3883489Z 2025-10-10T01:45:05.3883560Z Raises: 2025-10-10T01:45:05.3883798Z ValueError: if the module does not have a parameter or a buffer named :attr:`tensor_name` 2025-10-10T01:45:05.3883863Z 2025-10-10T01:45:05.3883940Z Examples: 2025-10-10T01:45:05.3884064Z >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_LAPACK) 2025-10-10T01:45:05.3884149Z >>> import torch 2025-10-10T01:45:05.3884237Z >>> import torch.nn as nn 2025-10-10T01:45:05.3884361Z >>> import torch.nn.utils.parametrize as P 2025-10-10T01:45:05.3884428Z >>> 2025-10-10T01:45:05.3884525Z >>> class Symmetric(nn.Module): 2025-10-10T01:45:05.3884616Z >>> def forward(self, X): 2025-10-10T01:45:05.3884769Z >>> return X.triu() + X.triu(1).T # Return a symmetric matrix 2025-10-10T01:45:05.3884840Z >>> 2025-10-10T01:45:05.3884940Z >>> def right_inverse(self, A): 2025-10-10T01:45:05.3885028Z >>> return A.triu() 2025-10-10T01:45:05.3885103Z >>> 2025-10-10T01:45:05.3885187Z >>> m = nn.Linear(5, 5) 2025-10-10T01:45:05.3885335Z >>> P.register_parametrization(m, "weight", Symmetric()) 2025-10-10T01:45:05.3885532Z >>> print(torch.allclose(m.weight, m.weight.T)) # m.weight is now symmetric 2025-10-10T01:45:05.3885606Z True 2025-10-10T01:45:05.3885691Z >>> A = torch.rand(5, 5) 2025-10-10T01:45:05.3885927Z >>> A = A + A.T # A is now symmetric 2025-10-10T01:45:05.3886095Z >>> m.weight = A # Initialize the weight to be the symmetric matrix A 2025-10-10T01:45:05.3886207Z >>> print(torch.allclose(m.weight, A)) 2025-10-10T01:45:05.3886279Z True 2025-10-10T01:45:05.3886348Z 2025-10-10T01:45:05.3886438Z >>> class RankOne(nn.Module): 2025-10-10T01:45:05.3886541Z >>> def forward(self, x, y): 2025-10-10T01:45:05.3886789Z >>> # Form a rank 1 matrix multiplying two vectors 2025-10-10T01:45:05.3886904Z >>> return x.unsqueeze(-1) @ y.unsqueeze(-2) 2025-10-10T01:45:05.3886970Z >>> 2025-10-10T01:45:05.3887069Z >>> def right_inverse(self, Z): 2025-10-10T01:45:05.3887172Z >>> # Project Z onto the rank 1 matrices 
2025-10-10T01:45:05.3887307Z >>> U, S, Vh = torch.linalg.svd(Z, full_matrices=False) 2025-10-10T01:45:05.3887411Z >>> # Return rescaled singular vectors 2025-10-10T01:45:05.3887518Z >>> s0_sqrt = S[0].sqrt().unsqueeze(-1) 2025-10-10T01:45:05.3887642Z >>> return U[..., :, 0] * s0_sqrt, Vh[..., 0, :] * s0_sqrt 2025-10-10T01:45:05.3887713Z >>> 2025-10-10T01:45:05.3887840Z >>> linear_rank_one = P.register_parametrization( 2025-10-10T01:45:05.3887952Z ... nn.Linear(4, 4), "weight", RankOne() 2025-10-10T01:45:05.3888019Z ... ) 2025-10-10T01:45:05.3888200Z >>> print(torch.linalg.matrix_rank(linear_rank_one.weight).item()) 2025-10-10T01:45:05.3888268Z 1 2025-10-10T01:45:05.3888343Z 2025-10-10T01:45:05.3888412Z 2025-10-10T01:45:05.3888839Z Original Error: IndentationError('expected an indented block after function definition on line 2', ('', 3, 0, '_._ = None\n', 3, -1)) 2025-10-10T01:45:05.3888906Z 2025-10-10T01:45:05.3888979Z _._ = None 2025-10-10T01:45:05.3889046Z ^ 2025-10-10T01:45:05.3889143Z warnings.warn(msg) 2025-10-10T01:45:05.3889208Z 2025-10-10T01:45:05.3889373Z --- Parse Warning: 15 / 18 --- 2025-10-10T01:45:05.3890134Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=DeviceMesh.__getitem__ in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py line=726. 2025-10-10T01:45:05.3890339Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3890415Z 2025-10-10T01:45:05.3890629Z Slice the current DeviceMesh based on the mesh_dim_names given to create a submesh. 
2025-10-10T01:45:05.3890851Z The submesh created consists of the dimensions and the communicators indicated by 2025-10-10T01:45:05.3890928Z ``mesh_dim_names`` 2025-10-10T01:45:05.3891000Z 2025-10-10T01:45:05.3891069Z Args: 2025-10-10T01:45:05.3891268Z mesh_dim_names (Union[str, Tuple[str]]): the name or the tuple of names of the 2025-10-10T01:45:05.3891419Z mesh dimension of the DeviceMesh to create the submesh for. 2025-10-10T01:45:05.3891491Z Returns: 2025-10-10T01:45:05.3891579Z A :class:`DeviceMesh` object 2025-10-10T01:45:05.3891649Z 2025-10-10T01:45:05.3891879Z The following program runs on each process/rank in an SPMD manner in a world size of 8. 2025-10-10T01:45:05.3891966Z In the first example: 2025-10-10T01:45:05.3892170Z Calling mesh_2d["tp"] on rank 0, 1, 2, 3 returns a 1D submesh of DeviceMesh:([0, 1, 2, 3]). 2025-10-10T01:45:05.3892380Z Calling mesh_2d["tp"] on rank 4, 5, 6, 7 returns a 1D submesh of DeviceMesh:([4, 5, 6, 7]). 2025-10-10T01:45:05.3892556Z Calling mesh_2d["dp"] on rank 0, 4 returns a 1D submesh of DeviceMesh:([0, 4]). 2025-10-10T01:45:05.3892732Z Calling mesh_2d["dp"] on rank 1, 5 returns a 1D submesh of DeviceMesh:([1, 5]). 2025-10-10T01:45:05.3893064Z Calling mesh_2d["dp"] on rank 2, 6 returns a 1D submesh of DeviceMesh:([2, 6]). 2025-10-10T01:45:05.3893241Z Calling mesh_2d["dp"] on rank 3, 7 returns a 1D submesh of DeviceMesh:([3, 7]). 2025-10-10T01:45:05.3893307Z 2025-10-10T01:45:05.3893395Z In the second example: 2025-10-10T01:45:05.3893607Z Calling mesh_3d["dp", "cp"] on rank 0, 1, 4, 5 returns a 2D submesh of DeviceMesh:([[0, 1], [4, 5]]). 2025-10-10T01:45:05.3893815Z Calling mesh_3d["dp", "cp"] on rank 2, 3, 6, 7 returns a 2D submesh of DeviceMesh:([[2, 3], [6, 7]]). 2025-10-10T01:45:05.3894143Z Calling mesh_3d["cp", "dp"] on rank 0, 1, 4, 5 returns a 2D submesh of DeviceMesh:([[0, 4], [1, 5]]). 2025-10-10T01:45:05.3894347Z Calling mesh_3d["cp", "dp"] on rank 2, 3, 6, 7 returns a 2D submesh of DeviceMesh:([[2, 6], [3, 7]]). 
2025-10-10T01:45:05.3894412Z 2025-10-10T01:45:05.3894488Z Example:: 2025-10-10T01:45:05.3894554Z 2025-10-10T01:45:05.3894648Z >>> # xdoctest: +SKIP("no rank") 2025-10-10T01:45:05.3894797Z >>> from torch.distributed.device_mesh import DeviceMesh 2025-10-10T01:45:05.3894869Z >>> 2025-10-10T01:45:05.3895028Z >>> # Initialize a 2D device mesh as (2, 4) to represent the topology 2025-10-10T01:45:05.3895147Z >>> # of cross-host(dim 0), and within-host (dim 1). 2025-10-10T01:45:05.3895351Z >>> mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp")) 2025-10-10T01:45:05.3895440Z >>> tp_mesh = mesh_2d["tp"] 2025-10-10T01:45:05.3895522Z >>> dp_mesh = mesh_2d["dp"] 2025-10-10T01:45:05.3895599Z >>> 2025-10-10T01:45:05.3895679Z >>> # Initialize a 3D mesh. 2025-10-10T01:45:05.3895902Z >>> mesh_3d = init_device_mesh(device_type="cuda", (2,2,2), mesh_dim_names=("dp", "pp", "cp")) 2025-10-10T01:45:05.3896142Z >>> # The order of the mesh_dim_names provided deteremines the order of dimensions in the submesh. 2025-10-10T01:45:05.3896240Z >>> dp_cp_mesh = mesh_3d["dp", "cp"] 2025-10-10T01:45:05.3896337Z >>> cp_dp_mesh = mesh_3d["cp", "dp"] 2025-10-10T01:45:05.3896407Z 2025-10-10T01:45:05.3896940Z Original Error: SyntaxError('positional argument follows keyword argument', ('', 6, 82, 'mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp"))\n', 6, 83)) 2025-10-10T01:45:05.3897011Z 2025-10-10T01:45:05.3897209Z mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp")) 2025-10-10T01:45:05.3897303Z ^ 2025-10-10T01:45:05.3897395Z warnings.warn(msg) 2025-10-10T01:45:05.3897461Z 2025-10-10T01:45:05.3897616Z --- Parse Warning: 16 / 18 --- 2025-10-10T01:45:05.3898359Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=FullStateDictConfig in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py line=295. 
2025-10-10T01:45:05.3898566Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3898634Z 2025-10-10T01:45:05.3898802Z ``FullStateDictConfig`` is a config class meant to be used with 2025-10-10T01:45:05.3898960Z ``StateDictType.FULL_STATE_DICT``. We recommend enabling both 2025-10-10T01:45:05.3899126Z ``offload_to_cpu=True`` and ``rank0_only=True`` when saving full state 2025-10-10T01:45:05.3899302Z dicts to save GPU memory and CPU memory, respectively. This config class 2025-10-10T01:45:05.3899468Z is meant to be used via the :func:`state_dict_type` context manager as 2025-10-10T01:45:05.3899535Z follows: 2025-10-10T01:45:05.3899602Z 2025-10-10T01:45:05.3899706Z >>> # xdoctest: +SKIP("undefined variables") 2025-10-10T01:45:05.3899902Z >>> from torch.distributed.fsdp import FullyShardedDataParallel as FSDP 2025-10-10T01:45:05.3900002Z >>> fsdp = FSDP(model, auto_wrap_policy=...) 2025-10-10T01:45:05.3900308Z >>> cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True) 2025-10-10T01:45:05.3900489Z >>> with FSDP.state_dict_type(fsdp, StateDictType.FULL_STATE_DICT, cfg): 2025-10-10T01:45:05.3900585Z >>> state = fsdp.state_dict() 2025-10-10T01:45:05.3900754Z >>> # `state` will be empty on non rank 0 and contain CPU tensors on rank 0. 2025-10-10T01:45:05.3900945Z >>> # To reload checkpoint for inference, finetuning, transfer learning, etc: 2025-10-10T01:45:05.3901131Z >>> model = model_fn() # Initialize model in preparation for wrapping with FSDP 2025-10-10T01:45:05.3901348Z >>> if dist.get_rank() == 0: 2025-10-10T01:45:05.3901498Z >>> # Load checkpoint only on rank 0 to avoid memory redundancy 2025-10-10T01:45:05.3901617Z >>> state_dict = torch.load("my_checkpoint.pt") 2025-10-10T01:45:05.3901717Z >>> model.load_state_dict(state_dict) 2025-10-10T01:45:05.3901904Z >>> # All ranks initialize FSDP module as usual. 
`sync_module_states` argument 2025-10-10T01:45:05.3902100Z >>> # communicates loaded checkpoint states from rank 0 to rest of the world. 2025-10-10T01:45:05.3902181Z >>> fsdp = FSDP( 2025-10-10T01:45:05.3902252Z ... model, 2025-10-10T01:45:05.3902365Z ... device_id=torch.cuda.current_device(), 2025-10-10T01:45:05.3902452Z ... auto_wrap_policy=..., 2025-10-10T01:45:05.3902541Z ... sync_module_states=True, 2025-10-10T01:45:05.3902608Z ... ) 2025-10-10T01:45:05.3902795Z >>> # After this point, all ranks have FSDP model with loaded checkpoint. 2025-10-10T01:45:05.3902859Z 2025-10-10T01:45:05.3902933Z Attributes: 2025-10-10T01:45:05.3903093Z rank0_only (bool): If ``True``, then only rank 0 saves the full state 2025-10-10T01:45:05.3903257Z dict, and nonzero ranks save an empty dict. If ``False``, then all 2025-10-10T01:45:05.3903382Z ranks save the full state dict. (Default: ``False``) 2025-10-10T01:45:05.3903450Z 2025-10-10T01:45:05.3903849Z Original Error: IndentationError("expected an indented block after 'if' statement on line 10", ('', 11, 1, '_._ = None\n', 11, 2)) 2025-10-10T01:45:05.3903919Z 2025-10-10T01:45:05.3903987Z _._ = None 2025-10-10T01:45:05.3904056Z ^ 2025-10-10T01:45:05.3904134Z warnings.warn(msg) 2025-10-10T01:45:05.3904203Z 2025-10-10T01:45:05.3904352Z --- Parse Warning: 17 / 18 --- 2025-10-10T01:45:05.3905106Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=SavePlanner in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/planner.py line=122. 2025-10-10T01:45:05.3905311Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3905376Z 2025-10-10T01:45:05.3905609Z Abstract class defining the protocol used by save_state_dict to plan the save process. 2025-10-10T01:45:05.3905673Z 2025-10-10T01:45:05.3905920Z SavePlanners are stateful objects that can be used to customize the whole save process. 
2025-10-10T01:45:05.3905986Z 2025-10-10T01:45:05.3906206Z SavePlanner acts as an access proxy to the state_dict, so any transformation done to it 2025-10-10T01:45:05.3906300Z will be visible to the whole process. 2025-10-10T01:45:05.3906368Z 2025-10-10T01:45:05.3906590Z A planner subclass can expect the following sequence of calls during save_state_dict: 2025-10-10T01:45:05.3906661Z 2025-10-10T01:45:05.3906761Z 1) set_up_planner - called on all ranks. 2025-10-10T01:45:05.3906868Z Signals the start of a checkpoint save. 2025-10-10T01:45:05.3906933Z 2025-10-10T01:45:05.3907036Z 2) create_local_plan - called on all ranks. 2025-10-10T01:45:05.3907262Z Process the state_dict and produces a `SavePlan` that will be sent for global planning. 2025-10-10T01:45:05.3907331Z 2025-10-10T01:45:05.3907615Z 3) create_global_plan - called on the coordinator rank only. 2025-10-10T01:45:05.3907782Z Takes the SavePlan from all ranks and make any global decision. 2025-10-10T01:45:05.3907847Z 2025-10-10T01:45:05.3907940Z 4) finish_plan - called on all ranks. 2025-10-10T01:45:05.3908116Z This gives each rank a chance to adjust to global planning decisions. 2025-10-10T01:45:05.3908187Z 2025-10-10T01:45:05.3908316Z 5) resolve_data - called multiple times on each rank 2025-10-10T01:45:05.3908483Z Lookups a value on the `state_dict` for the storage layer to write. 2025-10-10T01:45:05.3908674Z 2025-10-10T01:45:05.3908917Z Users are recommended to extend DefaultSavePlanner instead of this interface directly as 2025-10-10T01:45:05.3909062Z most changes can be expressed by changes in a single method. 2025-10-10T01:45:05.3909130Z 2025-10-10T01:45:05.3909229Z There are 3 usual patterns of extension: 2025-10-10T01:45:05.3909295Z 2025-10-10T01:45:05.3909504Z Rewriting state_dict. 
This is the simplest way to extend the save process as it 2025-10-10T01:45:05.3909689Z doesn't requite understanding the intrincacies of how SavePlan works: 2025-10-10T01:45:05.3909754Z 2025-10-10T01:45:05.3909851Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:45:05.3909969Z >>> class RenamePlanner(DefaultSavePlanner): 2025-10-10T01:45:05.3910055Z >>> def set_up_planner( 2025-10-10T01:45:05.3910125Z >>> self, 2025-10-10T01:45:05.3910217Z >>> state_dict: STATE_DICT_TYPE, 2025-10-10T01:45:05.3910326Z >>> storage_meta: Optional[StorageMeta], 2025-10-10T01:45:05.3910414Z >>> is_coordinator: bool, 2025-10-10T01:45:05.3910486Z >>> ) -> None: 2025-10-10T01:45:05.3910581Z >>> # prefix all keys with `foo_`` 2025-10-10T01:45:05.3910817Z >>> super().set_up_planner({"foo_" + k: v for k, v in state_dict.items()}, storage_meta, is_coordinator) 2025-10-10T01:45:05.3910885Z 2025-10-10T01:45:05.3911158Z Modifying local plan and lookup in tandem. This is useful when fine control of how data is persisted 2025-10-10T01:45:05.3911226Z 2025-10-10T01:45:05.3911322Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:45:05.3911429Z >>> class FP16Planner(DefaultSavePlanner): 2025-10-10T01:45:05.3911518Z >>> def create_local_plan(self): 2025-10-10T01:45:05.3911619Z >>> plan = super().create_local_plan() 2025-10-10T01:45:05.3911702Z >>> for p in plan: 2025-10-10T01:45:05.3911801Z >>> if p.tensor_data is not None: 2025-10-10T01:45:05.3911944Z >>> p.tensor_data.properties.dtype = torch.float16 2025-10-10T01:45:05.3912026Z >>> return plan 2025-10-10T01:45:05.3912092Z >>> 2025-10-10T01:45:05.3912189Z >>> def resolve_data(self, write_item): 2025-10-10T01:45:05.3912295Z >>> item = super().resolve_data(write_item) 2025-10-10T01:45:05.3912525Z >>> return item if write_item.type == WriteItemType.BYTE_IO else item.to(torch.float16) 2025-10-10T01:45:05.3912590Z 2025-10-10T01:45:05.3912866Z Using the global planning step to make central decisions that can't be made individually by each rank 
2025-10-10T01:45:05.3912932Z 2025-10-10T01:45:05.3913029Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:45:05.3913122Z >>> from itertools import zip_longest 2025-10-10T01:45:05.3913218Z >>> from dataclasses import replace 2025-10-10T01:45:05.3913355Z >>> class DDPLoadBalancingPlanner(DefaultSavePlanner): 2025-10-10T01:45:05.3913585Z >>> # This uses the default local plan behavior of having all non-sharded writes in rank 0 2025-10-10T01:45:05.3913691Z >>> # This sample doesn't handle ShardedTensors 2025-10-10T01:45:05.3913796Z >>> def create_global_plan(self, all_plans): 2025-10-10T01:45:05.3913918Z >>> iters = [iter(all_plans[0].items)] * len(all_plans) 2025-10-10T01:45:05.3914005Z >>> items_per_rank = [ 2025-10-10T01:45:05.3914339Z >>> [item for item in items if item is not None] 2025-10-10T01:45:05.3914475Z >>> for items in zip(*zip_longest(*iters), strict=True) 2025-10-10T01:45:05.3914542Z >>> ] 2025-10-10T01:45:05.3914624Z >>> all_plans = [ 2025-10-10T01:45:05.3914718Z >>> replace(plan, items=items) 2025-10-10T01:45:05.3914875Z >>> for plan, items in zip(all_plans, items_per_rank, strict=True) 2025-10-10T01:45:05.3914943Z >>> ] 2025-10-10T01:45:05.3915061Z >>> return super().create_global_plan(all_plans) 2025-10-10T01:45:05.3915276Z 2025-10-10T01:45:05.3915494Z Finally, some planners need to save additional metadata in the checkpoint, this is 2025-10-10T01:45:05.3915707Z accomplished by having each rank contribute their data items in the local plan and 2025-10-10T01:45:05.3915806Z the global planner aggregate them: 2025-10-10T01:45:05.3915871Z 2025-10-10T01:45:05.3915966Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:45:05.3916106Z >>> class SaveExtraDataPlanner(DefaultSavePlanner): 2025-10-10T01:45:05.3920861Z >>> def create_local_plan(self) -> SavePlan: 2025-10-10T01:45:05.3920984Z >>> plan = super().create_local_plan() 2025-10-10T01:45:05.3921134Z >>> return replace(plan, planner_data="per-rank-data") 2025-10-10T01:45:05.3921204Z >>> 
2025-10-10T01:45:05.3921463Z >>> def create_global_plan(self, all_plans: List[SavePlan]) -> Tuple[List[SavePlan], Metadata]: 2025-10-10T01:45:05.3921641Z >>> global_plan, metadata = super().create_global_plan(all_plans) 2025-10-10T01:45:05.3921777Z >>> merged_data = [p.planner_data for p in global_plan] 2025-10-10T01:45:05.3921921Z >>> metadata = replace(metadata, planner_data=merged_data) 2025-10-10T01:45:05.3922017Z >>> return global_plan, metadata 2025-10-10T01:45:05.3922088Z 2025-10-10T01:45:05.3922520Z Original Error: IndentationError('expected an indented block after function definition on line 3', ('', 9, 0, '_._ = None\n', 9, -1)) 2025-10-10T01:45:05.3922588Z 2025-10-10T01:45:05.3922657Z _._ = None 2025-10-10T01:45:05.3922726Z ^ 2025-10-10T01:45:05.3922809Z warnings.warn(msg) 2025-10-10T01:45:05.3922877Z 2025-10-10T01:45:05.3923062Z --- Parse Warning: 18 / 18 --- 2025-10-10T01:45:05.3923838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=LoadPlanner in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/planner.py line=305. 2025-10-10T01:45:05.3924052Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-10-10T01:45:05.3924123Z 2025-10-10T01:45:05.3924357Z Abstract class defining the protocol used by load_state_dict to plan the load process. 2025-10-10T01:45:05.3924423Z 2025-10-10T01:45:05.3924647Z LoadPlanner are stateful objects that can be used to customize the whole load process. 2025-10-10T01:45:05.3924723Z 2025-10-10T01:45:05.3924946Z LoadPlanner acts as an access proxy to the state_dict, so any transformation done to it 2025-10-10T01:45:05.3925047Z will be visible to the whole process. 2025-10-10T01:45:05.3925112Z 2025-10-10T01:45:05.3925339Z A planner subclass can expect the following sequence of calls during load_state_dict: 2025-10-10T01:45:05.3925407Z 2025-10-10T01:45:05.3925511Z 1) set_up_planner - called on all ranks. 
2025-10-10T01:45:05.3925623Z Signals the start of loading a checkpoint. 2025-10-10T01:45:05.3925691Z 2025-10-10T01:45:05.3925800Z 2) create_local_plan - called on all ranks. 2025-10-10T01:45:05.3926035Z Process the state_dict and produces a `LoadPlan` that will be sent for global planning. 2025-10-10T01:45:05.3926100Z 2025-10-10T01:45:05.3926256Z 3) create_global_plan - called on the coordinator rank only. 2025-10-10T01:45:05.3926630Z Takes the LoadPlan from all ranks and make any global decision. 2025-10-10T01:45:05.3926702Z 2025-10-10T01:45:05.3926824Z 4) load_bytes - called multiple times on each rank 2025-10-10T01:45:05.3926965Z This is called once per non-tensor value in state_dict. 2025-10-10T01:45:05.3927030Z 2025-10-10T01:45:05.3927213Z 5) resolve_tensor and commit_tensor - called multiple times on each rank 2025-10-10T01:45:05.3927360Z They are called in pair for each Tensor value in state_dict. 2025-10-10T01:45:05.3927430Z 2025-10-10T01:45:05.3927800Z Users are recommended to extend DefaultLoadPlanner instead of this interface directly as 2025-10-10T01:45:05.3927948Z most changes can be expressed by changes in a single method. 2025-10-10T01:45:05.3928012Z 2025-10-10T01:45:05.3928121Z There are two usual patterns of extension: 2025-10-10T01:45:05.3928184Z 2025-10-10T01:45:05.3928390Z Rewriting state_dict. This is the simplest way to extend the load process as it 2025-10-10T01:45:05.3928607Z doesn't requite understanding the intrincacies of how LoadPlan works. 
We need 2025-10-10T01:45:05.3928793Z to keep a reference to the original state_dict as load happens in place so 2025-10-10T01:45:05.3928895Z we need to be able to perform it in place 2025-10-10T01:45:05.3928964Z 2025-10-10T01:45:05.3929061Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:45:05.3929179Z >>> class RenamePlanner(DefaultLoadPlanner): 2025-10-10T01:45:05.3929264Z >>> def set_up_planner( 2025-10-10T01:45:05.3929345Z >>> self, 2025-10-10T01:45:05.3929440Z >>> state_dict: STATE_DICT_TYPE, 2025-10-10T01:45:05.3929529Z >>> metadata: Metadata, 2025-10-10T01:45:05.3929615Z >>> is_coordinator: bool, 2025-10-10T01:45:05.3929691Z >>> ) -> None: 2025-10-10T01:45:05.3929799Z >>> self.original_state_dict = state_dict 2025-10-10T01:45:05.3929947Z >>> state_dict = {"foo_" + k: v for k, v in state_dict.items()} 2025-10-10T01:45:05.3930015Z >>> 2025-10-10T01:45:05.3930122Z >>> if self.flatten_sharded_tensors: 2025-10-10T01:45:05.3930246Z >>> state_dict = _flatten_sharded_tensors(state_dict) 2025-10-10T01:45:05.3930314Z >>> 2025-10-10T01:45:05.3930403Z >>> if self.flatten_state_dict: 2025-10-10T01:45:05.3930553Z >>> state_dict, self.mappings = flatten_state_dict(state_dict) 2025-10-10T01:45:05.3930620Z >>> 2025-10-10T01:45:05.3930712Z >>> self.state_dict = state_dict 2025-10-10T01:45:05.3930807Z >>> self.metadata = metadata 2025-10-10T01:45:05.3930911Z >>> self.is_coordinator = is_coordinator 2025-10-10T01:45:05.3930975Z >>> 2025-10-10T01:45:05.3931080Z >>> def load_bytes(self, read_item, value): 2025-10-10T01:45:05.3931166Z >>> # Remove the "foo_" prefix 2025-10-10T01:45:05.3931431Z >>> self.original_state_dict[read_item.dest_index.fqn[4:]] = torch.load(value, weights_only=False) 2025-10-10T01:45:05.3931496Z 2025-10-10T01:45:05.3931570Z 2025-10-10T01:45:05.3931779Z Modifying resolve_tensor and commit_tensor to handle load time transformation. 
2025-10-10T01:45:05.3931851Z 2025-10-10T01:45:05.3931945Z >>> # xdoctest: +SKIP("undefined vars") 2025-10-10T01:45:05.3932079Z >>> class MetaModelMaterialize(DefaultSavePlanner): 2025-10-10T01:45:05.3932175Z >>> def resolve_tensor(self, read_item): 2025-10-10T01:45:05.3932292Z >>> tensor = super().resolve_tensor(read_item) 2025-10-10T01:45:05.3932418Z >>> return torch.empty_like(tensor, device="cpu") 2025-10-10T01:45:05.3932487Z >>> 2025-10-10T01:45:05.3932593Z >>> def commit_tensor(self, read_item, tensor): 2025-10-10T01:45:05.3932732Z >>> self.state_dict[read_item.dest_index.fqn] = tensor 2025-10-10T01:45:05.3932797Z 2025-10-10T01:45:05.3933220Z Original Error: IndentationError('expected an indented block after function definition on line 22', ('', 23, 0, '_._ = None\n', 23, -1)) 2025-10-10T01:45:05.3933422Z 2025-10-10T01:45:05.3933498Z _._ = None 2025-10-10T01:45:05.3933563Z ^ 2025-10-10T01:45:05.3933649Z warnings.warn(msg) 2025-10-10T01:45:05.3933714Z 2025-10-10T01:45:05.3933804Z  2025-10-10T01:45:05.3933946Z === Found 9 run-time warnings === 2025-10-10T01:45:05.3934092Z --- Runtime Warning: 1 / 9 --- 2025-10-10T01:45:05.3934306Z example = 2025-10-10T01:45:05.3935490Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py:1393: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /var/lib/jenkins/workspace/c10/core/TensorImpl.h:1971.) 2025-10-10T01:45:05.3935590Z return super().refine_names(names) 2025-10-10T01:45:05.3935655Z 2025-10-10T01:45:05.3935815Z --- Runtime Warning: 2 / 9 --- 2025-10-10T01:45:05.3936064Z example = 2025-10-10T01:45:05.3936558Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py:274: UserWarning: Warning only once for all operators, other operators may also be overridden. 
2025-10-10T01:45:05.3936806Z Overriding a previously registered kernel for the same operator and the same dispatch key 2025-10-10T01:45:05.3936978Z operator: aten::div.Tensor(Tensor self, Tensor other) -> Tensor 2025-10-10T01:45:05.3937225Z registered at /var/lib/jenkins/workspace/build/aten/src/ATen/RegisterSchema.cpp:6 2025-10-10T01:45:05.3937307Z dispatch key: CPU 2025-10-10T01:45:05.3937650Z previous kernel: registered at /var/lib/jenkins/workspace/aten/src/ATen/LegacyBatchingRegistrations.cpp:1079 2025-10-10T01:45:05.3938431Z new kernel: registered at :1 (Triggered internally at /var/lib/jenkins/workspace/aten/src/ATen/core/dispatch/OperatorEntry.cpp:208.) 2025-10-10T01:45:05.3938562Z impl_fn(self.ns, name.split("::")[-1], dispatch_key) 2025-10-10T01:45:05.3938631Z 2025-10-10T01:45:05.3938769Z --- Runtime Warning: 3 / 9 --- 2025-10-10T01:45:05.3938961Z example = 2025-10-10T01:45:05.3940416Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py:117: UserWarning: The PyTorch API of nested tensors is in prototype stage and will change in the near future. We recommend specifying layout=torch.jagged when constructing a nested tensor, as this layout receives active development, has better operator coverage, and works with torch.compile. (Triggered internally at /var/lib/jenkins/workspace/aten/src/ATen/NestedTensorImpl.cpp:178.) 2025-10-10T01:45:05.3940626Z return torch._nested_tensor_from_tensor_list(ts, dtype, None, device, None) 2025-10-10T01:45:05.3940696Z 2025-10-10T01:45:05.3940833Z --- Runtime Warning: 4 / 9 --- 2025-10-10T01:45:05.3941041Z example = 2025-10-10T01:45:05.3942304Z :1: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at /var/lib/jenkins/workspace/aten/src/ATen/SparseCsrTensorImpl.cpp:53.) 
2025-10-10T01:45:05.3942378Z 2025-10-10T01:45:05.3942514Z --- Runtime Warning: 5 / 9 --- 2025-10-10T01:45:05.3942759Z example = 2025-10-10T01:45:05.3944107Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/const_fold.py:278: UserWarning: Attempted to insert a get_attr Node with no underlying reference in the owning GraphModule! Call GraphModule.add_submodule to add the necessary submodule, GraphModule.add_parameter to add the necessary Parameter, or nn.Module.register_buffer to add the necessary buffer 2025-10-10T01:45:05.3944250Z new_node = root_const_gm.graph.get_attr(in_node.target) 2025-10-10T01:45:05.3944320Z 2025-10-10T01:45:05.3944462Z --- Runtime Warning: 6 / 9 --- 2025-10-10T01:45:05.3944825Z example = 2025-10-10T01:45:05.3945681Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:401: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-10-10T01:45:05.3945760Z warnings.warn( 2025-10-10T01:45:05.3945824Z 2025-10-10T01:45:05.3945970Z --- Runtime Warning: 7 / 9 --- 2025-10-10T01:45:05.3946229Z example = 2025-10-10T01:45:05.3947071Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:401: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-10-10T01:45:05.3947153Z warnings.warn( 2025-10-10T01:45:05.3947222Z 2025-10-10T01:45:05.3947355Z --- Runtime Warning: 8 / 9 --- 2025-10-10T01:45:05.3947576Z example = 2025-10-10T01:45:05.3948215Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:144: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`. 
2025-10-10T01:45:05.3948324Z WeightNorm.apply(module, name, dim) 2025-10-10T01:45:05.3948396Z 2025-10-10T01:45:05.3948532Z --- Runtime Warning: 9 / 9 --- 2025-10-10T01:45:05.3948778Z example = 2025-10-10T01:45:05.3949408Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:144: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`. 2025-10-10T01:45:05.3949516Z WeightNorm.apply(module, name, dim) 2025-10-10T01:45:05.3949582Z 2025-10-10T01:45:05.3949830Z === 376 passed, 501 skipped, 27 warnings in 15.54 seconds === 2025-10-10T01:45:05.3949986Z Running test_autoload_enable 1/1 ... [2025-10-10 01:45:05.361413] 2025-10-10T01:45:05.7606642Z Processing /var/lib/jenkins/pytorch/test/cpp_extensions 2025-10-10T01:45:09.1340310Z Preparing metadata (setup.py) ... [?25l- done 2025-10-10T01:45:09.1376343Z [?25hBuilding wheels for collected packages: torch_test_cpp_extension 2025-10-10T01:45:09.1392216Z  DEPRECATION: Building 'torch_test_cpp_extension' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'torch_test_cpp_extension'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T01:48:52.0053564Z  Building wheel for torch_test_cpp_extension (setup.py) ... 
done 2025-10-10T01:48:52.0258539Z Created wheel for torch_test_cpp_extension: filename=torch_test_cpp_extension-0.0.0-cp310-cp310-linux_x86_64.whl size=13001377 sha256=57dd5edf7f9bd801970f067c4334bdb4c8df4dec1849ceccb0e0f2b741f340b3 2025-10-10T01:48:52.0263052Z Stored in directory: /tmp/pip-ephem-wheel-cache-xv7x6d52/wheels/a9/2e/d7/a9e103243c0b754e2324c4ee6ddd055c388a2eefc520cfc979 2025-10-10T01:48:52.0284039Z Successfully built torch_test_cpp_extension 2025-10-10T01:48:52.3389796Z Installing collected packages: torch_test_cpp_extension 2025-10-10T01:48:52.5336669Z Successfully installed torch_test_cpp_extension-0.0.0 2025-10-10T01:48:55.0110672Z 2025-10-10T01:48:55.0111382Z Running tests... 2025-10-10T01:48:55.0112979Z ---------------------------------------------------------------------- 2025-10-10T01:48:55.2774832Z . 2025-10-10T01:48:55.2775435Z ---------------------------------------------------------------------- 2025-10-10T01:48:55.2776120Z Ran 1 test in 0.267s 2025-10-10T01:48:55.2776404Z 2025-10-10T01:48:55.2776565Z OK 2025-10-10T01:48:55.2776771Z 2025-10-10T01:48:55.2776995Z Generating XML reports... 2025-10-10T01:48:55.8026895Z Running test_reductions 1/1 ... [2025-10-10 01:48:55.802089] 2025-10-10T01:48:55.8027878Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:48:55.8033424Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_reductions.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ...
[2025-10-10 01:48:55.802770] 2025-10-10T01:51:11.4627158Z 2025-10-10T01:51:11.4628563Z test_reductions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_reductions_1.1_36ee833dd751b691_.log 2025-10-10T01:51:11.6238835Z Running 4757 items in this shard: test/test_reductions.py::TestReductionsCUDA::test_accreal_type_cuda, test/test_reductions.py::TestReductionsCUDA::test_all_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_all_any_empty_cuda, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_all_any_vs_numpy_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_all_any_with_dim_cuda, test/test_reductions.py::TestReductionsCUDA::test_all_issue117215_cuda, test/test_reductions.py::TestReductionsCUDA::test_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_amin_amax_some_dims_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_aminmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_aminmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_aminmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_argminmax_axis_with_dim_one_cuda, test/test_reductions.py::TestReductionsCUDA::test_argminmax_large_axis_cuda, test/test_reductions.py::TestReductionsCUDA::test_argminmax_multiple_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_argminmax_multiple_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_argminmax_multiple_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_argminmax_multiple_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_argminmax_multiple_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_argminmax_multiple_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_argminmax_multiple_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_argminmax_multiple_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_bincount_cuda, test/test_reductions.py::TestReductionsCUDA::test_bucketization_cuda, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_cumprod_integer_upcast_cuda, test/test_reductions.py::TestReductionsCUDA::test_cumsum_integer_upcast_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_arg_reduction_scalar_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_std_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_amin_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_std_unbiased_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_keepdim_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_default_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_all_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_prod_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_std_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_keepdim_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_empty_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_amin_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate__refs_var_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_var_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_multi_duplicate_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_logsumexp_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_keepdim_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_masked_var_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_multi_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_all_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_mean_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_keepdim_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_sum_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsorted_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsupported_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsupported_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsupported_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsupported_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_unsupported_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_multi_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_mean_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_masked_var_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_ndim_limit_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_argmin_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_none_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_amax_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_keepdim_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_argmin_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_none_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_std_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_nanmean_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_offbounds_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amax_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_max_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_max_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_max_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_max_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_max_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_max_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_max_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_max_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mean_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mean_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mean_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mean_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_median_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_median_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_median_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_median_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_median_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_median_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_median_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_median_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_min_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_min_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_min_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_min_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_min_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_min_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_min_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_min_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mode_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mode_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mode_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mode_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mode_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mode_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mode_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_mode_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_nanmedian_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_nanmedian_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_nanmedian_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_nanmedian_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_nanmedian_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_nanmedian_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_nanmedian_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_nanmedian_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_norm_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_norm_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_norm_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_norm_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_sum_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_fns_fn_name_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_lastdim_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_lastdim_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_dim_reduction_less_than_64_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_linalg_vector_norm_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim__refs_var_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_nansum_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_keepdim_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_sum_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_dim_single_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_dim_single_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_linalg_vector_norm_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_empty_slice_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_all_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice__refs_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_amin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_any_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_count_nonzero_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_hash_tensor_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_linalg_vector_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_amax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_amin_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_argmax_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_argmin_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_logsumexp_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_norm_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_masked_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_nanmean_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_nansum_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_prod_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_std_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_std_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_sum_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_empty_tensor_nonempty_slice_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_histc_cuda, test/test_reductions.py::TestReductionsCUDA::test_histc_lowp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_histc_lowp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_corner_cases_cuda_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_corner_cases_cuda_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_corner_cases_cuda_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_corner_cases_cuda_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_corner_cases_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_corner_cases_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_errors_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_errors_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_errors_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_errors_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_errors_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_histc_min_max_errors_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_histc_value_corner_cases_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_histc_value_corner_cases_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_histogram_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_histogram_error_handling_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_histogramdd_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_identity_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_invalid_0dim_aminmax_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_invalid_0dim_aminmax_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_logcumsumexp_complex_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_logcumsumexp_complex_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_logsumexp_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_logsumexp_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_logsumexp_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_logsumexp_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_logsumexp_dim_cuda, test/test_reductions.py::TestReductionsCUDA::test_logsumexp_integral_promotion_cuda, test/test_reductions.py::TestReductionsCUDA::test_max_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_max_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_max_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_max_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_max_elementwise_cuda, test/test_reductions.py::TestReductionsCUDA::test_max_mixed_devices_cuda, test/test_reductions.py::TestReductionsCUDA::test_max_with_inf_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_max_with_inf_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_max_with_inf_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_max_with_inf_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_mean_dim_cuda, test/test_reductions.py::TestReductionsCUDA::test_mean_int_with_optdtype_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_mean_out_is_alias_of_return_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_mean_out_is_alias_of_return_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_mean_out_is_alias_of_return_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_mean_out_is_alias_of_return_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_median_corner_cases_cuda, test/test_reductions.py::TestReductionsCUDA::test_median_nan_values_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_median_nan_values_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_median_nan_values_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_median_real_values_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_median_real_values_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_median_real_values_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_median_real_values_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_median_real_values_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_min_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_min_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_min_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_min_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_min_elementwise_cuda, test/test_reductions.py::TestReductionsCUDA::test_min_max_nan_cuda, test/test_reductions.py::TestReductionsCUDA::test_min_mixed_devices_cuda, test/test_reductions.py::TestReductionsCUDA::test_min_with_inf_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_min_with_inf_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_min_with_inf_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_min_with_inf_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_minmax_illegal_dtype_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_mode_boolean_cuda, test/test_reductions.py::TestReductionsCUDA::test_mode_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_mode_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_mode_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_mode_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_mode_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_mode_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_mode_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_mode_large_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_mode_wrong_device_cuda, test/test_reductions.py::TestReductionsCUDA::test_mode_wrong_dtype_cuda, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nanmean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nansum_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_omit_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_mean_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_sum_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_linalg_vector_norm_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_logsumexp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_logsumexp_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_logsumexp_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_logsumexp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_logsumexp_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_logsumexp_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_mean_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_std_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_prod_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_std_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_sum_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nan_policy_propagate_var_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nanmean_integral_types_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_nanmean_integral_types_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_nanmean_integral_types_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_nanmean_integral_types_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_nanmean_integral_types_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_nanmean_integral_types_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_nansum_complex_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_nansum_complex_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_nansum_out_dtype_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nansum_out_dtype_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nansum_out_dtype_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nansum_out_dtype_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_nansum_out_dtype_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_nansum_out_dtype_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_nansum_out_dtype_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_nansum_out_dtype_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_nansum_vs_numpy_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_nansum_vs_numpy_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_nansum_vs_numpy_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_nansum_vs_numpy_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_nansum_vs_numpy_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_nansum_vs_numpy_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_nansum_vs_numpy_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_mean_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_std_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all__refs_var_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_linalg_vector_norm_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_logsumexp_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_std_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nanmean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nanmean_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_std_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_all_var_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_mean_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_linalg_vector_norm_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_logsumexp_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_norm_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nanmean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nanmean_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nanmean_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_std_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_complex32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_expanded_var_unbiased_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_linalg_vector_norm_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_all_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_count_nonzero_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_logsumexp_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nanmean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nanmean_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_prod_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_std_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_innermost_var_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_linalg_vector_norm_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_prod_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_var_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmax_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_logsumexp_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_mean_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nanmean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nanmean_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_std_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_sum_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_outermost_var_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_mean_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_std_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_var_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_argmin_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_logsumexp_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_mean_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_mean_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nanmean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nanmean_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_std_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_noncontiguous_transposed_var_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_numpy_named_args_cuda, test/test_reductions.py::TestReductionsCUDA::test_prod_bool_cuda, test/test_reductions.py::TestReductionsCUDA::test_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_prod_gpu_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_prod_gpu_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_prod_integer_upcast_cuda, test/test_reductions.py::TestReductionsCUDA::test_prod_lowp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_prod_lowp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_quantile_backward_cuda, test/test_reductions.py::TestReductionsCUDA::test_quantile_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_quantile_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_quantile_error_cuda, 
test/test_reductions.py::TestReductionsCUDA::test_reduce_dtype_cuda, test/test_reductions.py::TestReductionsCUDA::test_reduction_empty_any_all_cuda, test/test_reductions.py::TestReductionsCUDA::test_reduction_split_cuda, test/test_reductions.py::TestReductionsCUDA::test_reduction_vectorize_along_input_corner_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reduction_vectorize_along_input_corner_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reduction_vectorize_along_input_corner_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reduction_vectorize_along_input_corner_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reduction_vectorize_along_output_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reduction_vectorize_along_output_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reduction_vectorize_along_output_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reduction_vectorize_along_output_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reductions_large_half_tensors_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reductions_large_half_tensors_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_reductions_large_half_tensors_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_std_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmax_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amin_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmin_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_std_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_duplicate_values_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_amax_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_any_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_mean_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_extremal_values_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_std_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_nanmean_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_1D_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_argmin_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_2D_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_amin_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_argmin_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_large_input_64bit_indexing_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_std_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmin_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_int16, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmax_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_mean_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_scalar_input_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amax_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmin_cuda_float64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_prod_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_std_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_ref_small_input_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_int32, 
test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmax_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_reference_masked_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_repeated_dim_cuda, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_mean_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype__refs_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_all_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_any_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_bfloat16, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_count_nonzero_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_hash_tensor_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_linalg_vector_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_linalg_vector_norm_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_linalg_vector_norm_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_linalg_vector_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_linalg_vector_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_linalg_vector_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_amin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmax_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_argmin_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_logsumexp_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_mean_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_norm_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_norm_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_norm_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_norm_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_prod_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_float16, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_std_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_complex64, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_masked_var_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_mean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_mean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_mean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_mean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_mean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_mean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nanmean_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nanmean_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nanmean_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nanmean_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nanmean_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nanmean_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nanmean_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_bool, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_nansum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_prod_cuda_uint8, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_std_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_complex32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_int64, 
test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_int8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_sum_cuda_uint8, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_unbiased_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_unbiased_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_unbiased_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_unbiased_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_unbiased_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_result_dtype_var_unbiased_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_std_correction_vs_numpy_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_std_correction_vs_numpy_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_std_correction_vs_numpy_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_std_correction_vs_numpy_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_std_dim_cuda, test/test_reductions.py::TestReductionsCUDA::test_std_mean_all_dims_cuda, test/test_reductions.py::TestReductionsCUDA::test_std_mean_correction_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_std_mean_correction_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_std_mean_correction_cuda_float32, 
test/test_reductions.py::TestReductionsCUDA::test_std_mean_correction_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_std_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_std_mean_some_dims_cuda, test/test_reductions.py::TestReductionsCUDA::test_std_vs_numpy_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_std_vs_numpy_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_std_vs_numpy_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_std_vs_numpy_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_sum_all_cuda_bool, test/test_reductions.py::TestReductionsCUDA::test_sum_all_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_sum_cpu_device_mismatch_cuda, test/test_reductions.py::TestReductionsCUDA::test_sum_dim_cuda, test/test_reductions.py::TestReductionsCUDA::test_sum_dim_reduction_uint8_overflow_cuda, test/test_reductions.py::TestReductionsCUDA::test_sum_integer_upcast_cuda, test/test_reductions.py::TestReductionsCUDA::test_sum_noncontig_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_sum_noncontig_lowp_cuda_bfloat16, test/test_reductions.py::TestReductionsCUDA::test_sum_noncontig_lowp_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_sum_out_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_sum_parallel_cuda, test/test_reductions.py::TestReductionsCUDA::test_sum_vs_numpy_cuda_float16, test/test_reductions.py::TestReductionsCUDA::test_sum_vs_numpy_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_sum_vs_numpy_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_sum_vs_numpy_cuda_int16, test/test_reductions.py::TestReductionsCUDA::test_sum_vs_numpy_cuda_int32, test/test_reductions.py::TestReductionsCUDA::test_sum_vs_numpy_cuda_int64, test/test_reductions.py::TestReductionsCUDA::test_sum_vs_numpy_cuda_int8, 
test/test_reductions.py::TestReductionsCUDA::test_tensor_compare_ops_argmax_argmix_kthvalue_dim_empty_cuda, test/test_reductions.py::TestReductionsCUDA::test_tensor_compare_ops_empty_cuda, test/test_reductions.py::TestReductionsCUDA::test_tensor_reduce_ops_empty_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_correction_vs_numpy_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_var_correction_vs_numpy_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_var_correction_vs_numpy_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_var_correction_vs_numpy_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_var_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_dim_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_large_input_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_mean_all_dims_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_mean_correction_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_var_mean_correction_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_var_mean_correction_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_var_mean_correction_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_var_mean_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_mean_some_dims_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_stability2_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_stability_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_unbiased_cuda, test/test_reductions.py::TestReductionsCUDA::test_var_vs_numpy_cuda_complex128, test/test_reductions.py::TestReductionsCUDA::test_var_vs_numpy_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_var_vs_numpy_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_var_vs_numpy_cuda_float64, test/test_reductions.py::TestReductionsCUDA::test_warn_invalid_degrees_of_freedom_cuda_complex128, 
test/test_reductions.py::TestReductionsCUDA::test_warn_invalid_degrees_of_freedom_cuda_complex64, test/test_reductions.py::TestReductionsCUDA::test_warn_invalid_degrees_of_freedom_cuda_float32, test/test_reductions.py::TestReductionsCUDA::test_warn_invalid_degrees_of_freedom_cuda_float64
2025-10-10T01:51:11.7637630Z
2025-10-10T01:51:11.7637796Z Running test_fake_tensor 1/1 ... [2025-10-10 01:51:11.475162]
2025-10-10T01:51:11.7638136Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T01:51:11.7638976Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fake_tensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:51:11.475887]
2025-10-10T01:51:32.0926567Z
2025-10-10T01:51:32.0927954Z test_fake_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fake_tensor_1.1_398be7670952018e_.log
2025-10-10T01:51:32.1107809Z Running 286 items in this shard: test/test_fake_tensor.py::FakeTensorTest::test__adaptive_avg_pool2d_backward, test/test_fake_tensor.py::FakeTensorTest::test_alias_call, test/test_fake_tensor.py::FakeTensorTest::test_allow_meta, test/test_fake_tensor.py::FakeTensorTest::test_aten_copy_multi_device, test/test_fake_tensor.py::FakeTensorTest::test_aten_index_multi_device, test/test_fake_tensor.py::FakeTensorTest::test_aten_slice_scatter_multi_device, test/test_fake_tensor.py::FakeTensorTest::test_basic, test/test_fake_tensor.py::FakeTensorTest::test_batch_tensor, test/test_fake_tensor.py::FakeTensorTest::test_binary_op_type_promotion, test/test_fake_tensor.py::FakeTensorTest::test_constructor, test/test_fake_tensor.py::FakeTensorTest::test_conv_nhwc, test/test_fake_tensor.py::FakeTensorTest::test_convert_fake_to_real, test/test_fake_tensor.py::FakeTensorTest::test_cpu_fallback, test/test_fake_tensor.py::FakeTensorTest::test_cuda_initialized,
test/test_fake_tensor.py::FakeTensorTest::test_cuda_lstm, test/test_fake_tensor.py::FakeTensorTest::test_cudnn_rnn_with_fallback, test/test_fake_tensor.py::FakeTensorTest::test_cudnn_rnn_without_fallback, test/test_fake_tensor.py::FakeTensorTest::test_custom_op_fallback, test/test_fake_tensor.py::FakeTensorTest::test_data_dependent_operator, test/test_fake_tensor.py::FakeTensorTest::test_deepcopy, test/test_fake_tensor.py::FakeTensorTest::test_device_inplace_copy, test/test_fake_tensor.py::FakeTensorTest::test_embedding_bag_meta, test/test_fake_tensor.py::FakeTensorTest::test_export_numpy, test/test_fake_tensor.py::FakeTensorTest::test_fake_device, test/test_fake_tensor.py::FakeTensorTest::test_fake_dispatch_keys, test/test_fake_tensor.py::FakeTensorTest::test_fake_grad_copy, test/test_fake_tensor.py::FakeTensorTest::test_fake_mode_error, test/test_fake_tensor.py::FakeTensorTest::test_fast_div, test/test_fake_tensor.py::FakeTensorTest::test_from_numpy, test/test_fake_tensor.py::FakeTensorTest::test_fsdp_flat_param, test/test_fake_tensor.py::FakeTensorTest::test_full, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_complex128, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_complex64, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float32, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float64, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float8_e4m3fn, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float8_e4m3fnuz, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float8_e5m2, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float8_e5m2fnuz, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_int16, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_int32, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_int64, 
test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_int8, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_uint8, test/test_fake_tensor.py::FakeTensorTest::test_index_put_error, test/test_fake_tensor.py::FakeTensorTest::test_jagged_fake_to_fake_preserved, test/test_fake_tensor.py::FakeTensorTest::test_like_constructor, test/test_fake_tensor.py::FakeTensorTest::test_mixed_real_and_fake_inputs, test/test_fake_tensor.py::FakeTensorTest::test_mode, test/test_fake_tensor.py::FakeTensorTest::test_nan_to_num, test/test_fake_tensor.py::FakeTensorTest::test_nanmean_out, test/test_fake_tensor.py::FakeTensorTest::test_new, test/test_fake_tensor.py::FakeTensorTest::test_no_tag_func, test/test_fake_tensor.py::FakeTensorTest::test_non_kwarg_device, test/test_fake_tensor.py::FakeTensorTest::test_non_overlapping_stride_zero, test/test_fake_tensor.py::FakeTensorTest::test_non_parameter_grad, test/test_fake_tensor.py::FakeTensorTest::test_normalize_device, test/test_fake_tensor.py::FakeTensorTest::test_op_with_zero_dim_bypassed, test/test_fake_tensor.py::FakeTensorTest::test_out_multi_device, test/test_fake_tensor.py::FakeTensorTest::test_parameter_instantiation, test/test_fake_tensor.py::FakeTensorTest::test_parameter_view, test/test_fake_tensor.py::FakeTensorTest::test_print_in_fake_mode, test/test_fake_tensor.py::FakeTensorTest::test_randperm, test/test_fake_tensor.py::FakeTensorTest::test_recursive_invocation, test/test_fake_tensor.py::FakeTensorTest::test_repr, test/test_fake_tensor.py::FakeTensorTest::test_same_shape_env_preserved, test/test_fake_tensor.py::FakeTensorTest::test_scalar_inputs, test/test_fake_tensor.py::FakeTensorTest::test_scan_reverse_False, test/test_fake_tensor.py::FakeTensorTest::test_scan_reverse_True, test/test_fake_tensor.py::FakeTensorTest::test_setitem, test/test_fake_tensor.py::FakeTensorTest::test_shape_take_not_device, test/test_fake_tensor.py::FakeTensorTest::test_split_return_self, 
test/test_fake_tensor.py::FakeTensorTest::test_throw, test/test_fake_tensor.py::FakeTensorTest::test_tolist, test/test_fake_tensor.py::FakeTensorTest::test_type_as, test/test_fake_tensor.py::FakeTensorTest::test_unbind_copy_out, test/test_fake_tensor.py::FakeTensorTest::test_unsqueeze_copy, test/test_fake_tensor.py::FakeTensorTest::test_upsample_bilinear_small_channels, test/test_fake_tensor.py::FakeTensorTest::test_zero_dim, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test__adaptive_avg_pool2d_backward_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_alias_call_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_allow_meta_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_aten_copy_multi_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_aten_index_multi_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_aten_slice_scatter_multi_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_basic_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_batch_tensor_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_binary_op_type_promotion_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_constructor_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_conv_nhwc_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_convert_fake_to_real_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cpu_fallback_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cuda_initialized_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cuda_lstm_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cudnn_rnn_with_fallback_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cudnn_rnn_without_fallback_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_custom_op_fallback_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_data_dependent_operator_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_deepcopy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_device_inplace_copy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_embedding_bag_meta_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_export_numpy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fake_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fake_dispatch_keys_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fake_grad_copy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fake_mode_error_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fast_div_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_from_numpy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fsdp_flat_param_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_full_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_complex128_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_complex64_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float32_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float64_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float8_e4m3fn_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float8_e4m3fnuz_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float8_e5m2_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float8_e5m2fnuz_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_int16_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_int32_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_int64_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_int8_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_uint8_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_put_error_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_jagged_fake_to_fake_preserved_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_like_constructor_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_mixed_real_and_fake_inputs_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_mode_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_nan_to_num_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_nanmean_out_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_new_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_no_tag_func_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_non_kwarg_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_non_overlapping_stride_zero_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_non_parameter_grad_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_normalize_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_op_with_zero_dim_bypassed_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_out_multi_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_parameter_instantiation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_parameter_view_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_print_in_fake_mode_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_randperm_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_recursive_invocation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_repr_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_same_shape_env_preserved_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_scalar_inputs_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_scan_reverse_False_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_scan_reverse_True_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_setitem_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_shape_take_not_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_split_return_self_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_throw_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_tolist_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_type_as_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_unbind_copy_out_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_unsqueeze_copy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_upsample_bilinear_small_channels_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_zero_dim_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorConstHandling::test_aliased_const_write, test/test_fake_tensor.py::FakeTensorConstHandling::test_constant_invalidation, test/test_fake_tensor.py::FakeTensorConstHandling::test_constant_propagate_through_functions, test/test_fake_tensor.py::FakeTensorConstHandling::test_fake_tensor_batch_norm_cpu, test/test_fake_tensor.py::FakeTensorConstHandling::test_fake_tensor_in_intlist_repro, test/test_fake_tensor.py::FakeTensorConstHandling::test_inplace_add, test/test_fake_tensor.py::FakeTensorConstHandling::test_inplace_view_invalidation, test/test_fake_tensor.py::FakeTensorConstHandling::test_shared_storage_invalidation, 
test/test_fake_tensor.py::FakeTensorConstHandling::test_shared_storages, test/test_fake_tensor.py::FakeTensorConstHandling::test_simple, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_aliased_const_write_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_constant_invalidation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_constant_propagate_through_functions_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_fake_tensor_batch_norm_cpu_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_fake_tensor_in_intlist_repro_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_inplace_add_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_inplace_view_invalidation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_shared_storage_invalidation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_shared_storages_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_simple_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyCatCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyCubeCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyMulCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyMulScalarCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyNMSCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyNonzeroCustomOp_cuda_float32, 
test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpySortCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpySplitCopyCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpySplitCopyWithIntCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyTakeCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyViewCopyCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorConverterTest::test_dead_key, test/test_fake_tensor.py::FakeTensorConverterTest::test_dead_weak_ref, test/test_fake_tensor.py::FakeTensorConverterTest::test_memoized_conversion_from_meta, test/test_fake_tensor.py::FakeTensorConverterTest::test_memoized_conversion_to_meta, test/test_fake_tensor.py::FakeTensorConverterTest::test_multiple_modes, test/test_fake_tensor.py::FakeTensorConverterTest::test_no_active_mode, test/test_fake_tensor.py::FakeTensorConverterTest::test_no_ref_cycle, test/test_fake_tensor.py::FakeTensorConverterTest::test_separate_mode_error, test/test_fake_tensor.py::FakeTensorConverterTest::test_separate_tensor_storages_non_view, test/test_fake_tensor.py::FakeTensorConverterTest::test_separate_tensor_storages_view, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_dead_key_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_dead_weak_ref_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_memoized_conversion_from_meta_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_memoized_conversion_to_meta_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_multiple_modes_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_no_active_mode_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_no_ref_cycle_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_separate_mode_error_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_separate_tensor_storages_non_view_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_separate_tensor_storages_view_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_conv_c1_backward, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_cross_entropy_loss, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_embedding_bag_private, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_fake_gpu_no_init, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_flash_attention, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_like_ops, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_module_to, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_move_meta_tensor, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_move_module_under_fake, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_no_dispatch_with_like_function, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_non_kwarg_only_device, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_sparse_new, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_str_storage, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_tensor_constructors_all_have_kwarg_device, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_tensor_new, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_conv_c1_backward_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_cross_entropy_loss_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_embedding_bag_private_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_fake_gpu_no_init_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_flash_attention_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_like_ops_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_module_to_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_move_meta_tensor_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_move_module_under_fake_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_no_dispatch_with_like_function_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_non_kwarg_only_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_sparse_new_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_str_storage_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_tensor_constructors_all_have_kwarg_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_tensor_new_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorPropTest::test_fake_tensor_prop_on_nn_module, test/test_fake_tensor.py::FakeTensorPropTest::test_fake_tensor_prop_on_nn_module_with_optional_args, test/test_fake_tensor.py::FakeTensorPropTest::test_nan_to_num, test/test_fake_tensor.py::FakeTensorPropTest::test_nonzero_stride, test/test_fake_tensor.py::FakeTensorPropTest::test_torch_load_with_fake_mode, 
test/test_fake_tensor.py::FakeTensorPropTest::test_unbacked_shape_realloc, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_fake_tensor_prop_on_nn_module_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_fake_tensor_prop_on_nn_module_with_optional_args_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_nan_to_num_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_nonzero_stride_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_torch_load_with_fake_mode_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_unbacked_shape_realloc_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorSerialization::test_serialization, test/test_fake_tensor.py::FakeTensorSerialization::test_serialization_with_tracing, test/test_fake_tensor.py::FakeTensorDispatchCache::test__upsample_bilinear2d_aa_backward_dynamic_shapes, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_aten_index, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_bypass, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_default_device, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_default_dtype, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_dispatch_key_set, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_hit, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_inplace_op, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_constants, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_device, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_dtype, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_is_conj, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_is_inference, 
test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_is_neg, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_memory_format, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_requires_grad, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_shape, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_storage_offset, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_stride, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_tuple_outputs, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_view_op, test/test_fake_tensor.py::FakeTensorDispatchCache::test_empty_list, test/test_fake_tensor.py::FakeTensorDispatchCache::test_fft_hfft2_issue145522, test/test_fake_tensor.py::FakeTensorDispatchCache::test_from_buffer, test/test_fake_tensor.py::FakeTensorDispatchCache::test_inference_mode, test/test_fake_tensor.py::FakeTensorDispatchCache::test_invoke_subgraph, test/test_fake_tensor.py::FakeTensorDispatchCache::test_invoke_subgraph_cacheable_inplace, test/test_fake_tensor.py::FakeTensorDispatchCache::test_meta_tensor_to_fake_cpu, test/test_fake_tensor.py::FakeTensorDispatchCache::test_shape_env_settings, test/test_fake_tensor.py::FakeTensorDispatchCache::test_unbacked_output, test/test_fake_tensor.py::FakeTensorDispatchCache::test_wrapper_tensor_subclass_different_device, test/test_fake_tensor.py::FakeTensorPreferDeviceType::test_fake_tensor_prefer_device_type, test/test_fake_tensor.py::FakeTensorPreferDeviceType::test_fake_tensor_prefer_device_type_cpu_only 2025-10-10T01:51:32.1264299Z 2025-10-10T01:51:32.1264548Z Running test_nn 1/1 ... 
[2025-10-10 01:51:32.093057] 2025-10-10T01:51:32.1265142Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T01:51:32.1266663Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nn.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:51:32.093690] 2025-10-10T01:58:51.8189172Z 2025-10-10T01:58:51.8190395Z test_nn 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_nn_1.1_aa62ebb1591c5a82_.log 2025-10-10T01:58:51.9436640Z Running 2293 items in this shard: test/test_nn.py::TestNN::test_AdaptiveLogSoftmax, test/test_nn.py::TestNN::test_AdaptiveLogSoftmax_cuda, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_none, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_BCELoss_no_reduce, test/test_nn.py::TestNN::test_BCELoss_no_reduce_cuda, test/test_nn.py::TestNN::test_BCELoss_no_reduce_scalar, test/test_nn.py::TestNN::test_BCELoss_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_BCELoss_weights_no_reduce, test/test_nn.py::TestNN::test_BCELoss_weights_no_reduce_cuda, test/test_nn.py::TestNN::test_BCELoss_weights_no_reduce_scalar, test/test_nn.py::TestNN::test_BCELoss_weights_no_reduce_scalar_cuda, 
test/test_nn.py::TestNN::test_BCEWithLogitsLoss_legacy_enum, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_legacy_enum_cuda, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_reduce, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_reduce_scalar, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_CELU_no_batch_dim, test/test_nn.py::TestNN::test_CELU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_CTCLoss_critical_target_len, test/test_nn.py::TestNN::test_CTCLoss_lengthchecks_cpu, test/test_nn.py::TestNN::test_CTCLoss_lengthchecks_cuda, test/test_nn.py::TestNN::test_CTCLoss_long_targets, test/test_nn.py::TestNN::test_CTCLoss_typechecks, test/test_nn.py::TestNN::test_CTCLoss_zero_infinity, test/test_nn.py::TestNN::test_CTCLoss_zero_lengths, test/test_nn.py::TestNN::test_Conv1d, test/test_nn.py::TestNN::test_Conv1d_circular_stride2_pad2, test/test_nn.py::TestNN::test_Conv1d_circular_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv1d_cuda, 
test/test_nn.py::TestNN::test_Conv1d_dilated, test/test_nn.py::TestNN::test_Conv1d_dilated_cuda, test/test_nn.py::TestNN::test_Conv1d_groups, test/test_nn.py::TestNN::test_Conv1d_groups_cuda, test/test_nn.py::TestNN::test_Conv1d_pad1, test/test_nn.py::TestNN::test_Conv1d_pad1_cuda, test/test_nn.py::TestNN::test_Conv1d_pad1size1, test/test_nn.py::TestNN::test_Conv1d_pad1size1_cuda, test/test_nn.py::TestNN::test_Conv1d_pad2, test/test_nn.py::TestNN::test_Conv1d_pad2_cuda, test/test_nn.py::TestNN::test_Conv1d_pad2size1, test/test_nn.py::TestNN::test_Conv1d_pad2size1_cuda, test/test_nn.py::TestNN::test_Conv1d_pad_same, test/test_nn.py::TestNN::test_Conv1d_pad_same2, test/test_nn.py::TestNN::test_Conv1d_pad_same2_cuda, test/test_nn.py::TestNN::test_Conv1d_pad_same_cuda, test/test_nn.py::TestNN::test_Conv1d_pad_same_dilated, test/test_nn.py::TestNN::test_Conv1d_pad_same_dilated_cuda, test/test_nn.py::TestNN::test_Conv1d_pad_valid, test/test_nn.py::TestNN::test_Conv1d_pad_valid_cuda, test/test_nn.py::TestNN::test_Conv1d_reflect_stride2_pad2, test/test_nn.py::TestNN::test_Conv1d_reflect_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv1d_replicate_stride2_pad2, test/test_nn.py::TestNN::test_Conv1d_replicate_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv1d_stride, test/test_nn.py::TestNN::test_Conv1d_stride_cuda, test/test_nn.py::TestNN::test_Conv1d_zero_batch, test/test_nn.py::TestNN::test_Conv1d_zero_batch_cuda, test/test_nn.py::TestNN::test_Conv1d_zeros_stride2_pad2, test/test_nn.py::TestNN::test_Conv1d_zeros_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv2d, test/test_nn.py::TestNN::test_Conv2d_circular_stride2_pad2, test/test_nn.py::TestNN::test_Conv2d_circular_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv2d_cuda, test/test_nn.py::TestNN::test_Conv2d_depthwise, test/test_nn.py::TestNN::test_Conv2d_depthwise_cuda, test/test_nn.py::TestNN::test_Conv2d_depthwise_dilated, test/test_nn.py::TestNN::test_Conv2d_depthwise_dilated_cuda, 
test/test_nn.py::TestNN::test_Conv2d_depthwise_padded, test/test_nn.py::TestNN::test_Conv2d_depthwise_padded_cuda, test/test_nn.py::TestNN::test_Conv2d_depthwise_strided, test/test_nn.py::TestNN::test_Conv2d_depthwise_strided_cuda, test/test_nn.py::TestNN::test_Conv2d_depthwise_with_multiplier, test/test_nn.py::TestNN::test_Conv2d_depthwise_with_multiplier_cuda, test/test_nn.py::TestNN::test_Conv2d_dilated, test/test_nn.py::TestNN::test_Conv2d_dilated_cuda, test/test_nn.py::TestNN::test_Conv2d_dilated_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_dilated_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv2d_groups, test/test_nn.py::TestNN::test_Conv2d_groups_cuda, test/test_nn.py::TestNN::test_Conv2d_groups_thnn, test/test_nn.py::TestNN::test_Conv2d_groups_thnn_cuda, test/test_nn.py::TestNN::test_Conv2d_groups_thnn_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_groups_thnn_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv2d_groups_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_groups_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv2d_no_bias, test/test_nn.py::TestNN::test_Conv2d_no_bias_cuda, test/test_nn.py::TestNN::test_Conv2d_no_bias_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_no_bias_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv2d_pad_same, test/test_nn.py::TestNN::test_Conv2d_pad_same_cuda, test/test_nn.py::TestNN::test_Conv2d_pad_same_dilated, test/test_nn.py::TestNN::test_Conv2d_pad_same_dilated_cuda, test/test_nn.py::TestNN::test_Conv2d_pad_valid, test/test_nn.py::TestNN::test_Conv2d_pad_valid_cuda, test/test_nn.py::TestNN::test_Conv2d_padding, test/test_nn.py::TestNN::test_Conv2d_padding_cuda, test/test_nn.py::TestNN::test_Conv2d_padding_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_padding_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv2d_reflect_stride2_pad2, test/test_nn.py::TestNN::test_Conv2d_reflect_stride2_pad2_cuda, 
test/test_nn.py::TestNN::test_Conv2d_replicate_stride2_pad2, test/test_nn.py::TestNN::test_Conv2d_replicate_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv2d_strided, test/test_nn.py::TestNN::test_Conv2d_strided_cuda, test/test_nn.py::TestNN::test_Conv2d_strided_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_strided_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv2d_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv2d_zero_batch, test/test_nn.py::TestNN::test_Conv2d_zero_batch_cuda, test/test_nn.py::TestNN::test_Conv2d_zero_batch_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_zero_batch_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv2d_zeros_stride2_pad2, test/test_nn.py::TestNN::test_Conv2d_zeros_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv3d, test/test_nn.py::TestNN::test_Conv3d_1x1x1_no_bias, test/test_nn.py::TestNN::test_Conv3d_1x1x1_no_bias_cuda, test/test_nn.py::TestNN::test_Conv3d_1x1x1_no_bias_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_1x1x1_no_bias_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv3d_circular_stride2_pad2, test/test_nn.py::TestNN::test_Conv3d_circular_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv3d_cuda, test/test_nn.py::TestNN::test_Conv3d_dilated, test/test_nn.py::TestNN::test_Conv3d_dilated_cuda, test/test_nn.py::TestNN::test_Conv3d_dilated_strided, test/test_nn.py::TestNN::test_Conv3d_dilated_strided_cuda, test/test_nn.py::TestNN::test_Conv3d_groups, test/test_nn.py::TestNN::test_Conv3d_groups_cuda, test/test_nn.py::TestNN::test_Conv3d_groups_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_groups_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv3d_no_bias, test/test_nn.py::TestNN::test_Conv3d_no_bias_cuda, test/test_nn.py::TestNN::test_Conv3d_no_bias_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_no_bias_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv3d_pad_same, 
test/test_nn.py::TestNN::test_Conv3d_pad_same_cuda, test/test_nn.py::TestNN::test_Conv3d_pad_same_dilated, test/test_nn.py::TestNN::test_Conv3d_pad_same_dilated_cuda, test/test_nn.py::TestNN::test_Conv3d_pad_valid, test/test_nn.py::TestNN::test_Conv3d_pad_valid_cuda, test/test_nn.py::TestNN::test_Conv3d_replicate_stride2_pad2, test/test_nn.py::TestNN::test_Conv3d_replicate_stride2_pad2_cuda, test/test_nn.py::TestNN::test_Conv3d_stride, test/test_nn.py::TestNN::test_Conv3d_stride_cuda, test/test_nn.py::TestNN::test_Conv3d_stride_padding, test/test_nn.py::TestNN::test_Conv3d_stride_padding_cuda, test/test_nn.py::TestNN::test_Conv3d_stride_padding_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_stride_padding_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv3d_stride_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_stride_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv3d_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv3d_zero_batch, test/test_nn.py::TestNN::test_Conv3d_zero_batch_cuda, test/test_nn.py::TestNN::test_Conv3d_zero_batch_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_zero_batch_with_long_tensor_cuda, test/test_nn.py::TestNN::test_Conv3d_zeros_stride2_pad2, test/test_nn.py::TestNN::test_Conv3d_zeros_stride2_pad2_cuda, test/test_nn.py::TestNN::test_ConvTranspose1d, test/test_nn.py::TestNN::test_ConvTranspose1d_cuda, test/test_nn.py::TestNN::test_ConvTranspose1d_dilated, test/test_nn.py::TestNN::test_ConvTranspose1d_dilated_cuda, test/test_nn.py::TestNN::test_ConvTranspose1d_groups, test/test_nn.py::TestNN::test_ConvTranspose1d_groups_cuda, test/test_nn.py::TestNN::test_ConvTranspose1d_no_bias, test/test_nn.py::TestNN::test_ConvTranspose1d_no_bias_cuda, test/test_nn.py::TestNN::test_ConvTranspose2d, test/test_nn.py::TestNN::test_ConvTranspose2d_cuda, test/test_nn.py::TestNN::test_ConvTranspose2d_dilated, 
test/test_nn.py::TestNN::test_ConvTranspose2d_dilated_cuda, test/test_nn.py::TestNN::test_ConvTranspose2d_dilated_with_long_tensor, test/test_nn.py::TestNN::test_ConvTranspose2d_dilated_with_long_tensor_cuda, test/test_nn.py::TestNN::test_ConvTranspose2d_groups, test/test_nn.py::TestNN::test_ConvTranspose2d_groups_cuda, test/test_nn.py::TestNN::test_ConvTranspose2d_groups_with_long_tensor, test/test_nn.py::TestNN::test_ConvTranspose2d_groups_with_long_tensor_cuda, test/test_nn.py::TestNN::test_ConvTranspose2d_no_bias, test/test_nn.py::TestNN::test_ConvTranspose2d_no_bias_cuda, test/test_nn.py::TestNN::test_ConvTranspose2d_no_bias_with_long_tensor, test/test_nn.py::TestNN::test_ConvTranspose2d_no_bias_with_long_tensor_cuda, test/test_nn.py::TestNN::test_ConvTranspose2d_with_long_tensor, test/test_nn.py::TestNN::test_ConvTranspose2d_with_long_tensor_cuda, test/test_nn.py::TestNN::test_ConvTranspose3d, test/test_nn.py::TestNN::test_ConvTranspose3d_cuda, test/test_nn.py::TestNN::test_ConvTranspose3d_dilated, test/test_nn.py::TestNN::test_ConvTranspose3d_dilated_cuda, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_sum_cuda_float, 
test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_CrossMapLRN2d, test/test_nn.py::TestNN::test_CrossMapLRN2d_cuda, test/test_nn.py::TestNN::test_ELU_no_batch_dim, test/test_nn.py::TestNN::test_ELU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Embedding, test/test_nn.py::TestNN::test_EmbeddingBag_discontiguous, test/test_nn.py::TestNN::test_EmbeddingBag_discontiguous_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_max, test/test_nn.py::TestNN::test_EmbeddingBag_max_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_max_padding_idx, test/test_nn.py::TestNN::test_EmbeddingBag_max_padding_idx_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_mean, test/test_nn.py::TestNN::test_EmbeddingBag_mean_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_mean_padding_idx, test/test_nn.py::TestNN::test_EmbeddingBag_mean_padding_idx_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_sparse, test/test_nn.py::TestNN::test_EmbeddingBag_sparse_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_sum, test/test_nn.py::TestNN::test_EmbeddingBag_sum_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_sum_padding_idx, test/test_nn.py::TestNN::test_EmbeddingBag_sum_padding_idx_cuda, test/test_nn.py::TestNN::test_Embedding_cuda, test/test_nn.py::TestNN::test_Embedding_discontiguous, test/test_nn.py::TestNN::test_Embedding_discontiguous_cuda, test/test_nn.py::TestNN::test_Embedding_sparse, test/test_nn.py::TestNN::test_Embedding_sparse_cuda, test/test_nn.py::TestNN::test_Flatten, test/test_nn.py::TestNN::test_Flatten_cuda, test/test_nn.py::TestNN::test_Flatten_no_batch_dim, test/test_nn.py::TestNN::test_Flatten_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Fold, test/test_nn.py::TestNN::test_Fold_cuda, test/test_nn.py::TestNN::test_Fold_int_input, test/test_nn.py::TestNN::test_Fold_int_input_cuda, test/test_nn.py::TestNN::test_Fold_no_batch_dim_input, test/test_nn.py::TestNN::test_Fold_no_batch_dim_input_cuda, 
test/test_nn.py::TestNN::test_Fold_no_batch_dim_int_input, test/test_nn.py::TestNN::test_Fold_no_batch_dim_int_input_cuda, test/test_nn.py::TestNN::test_GELU_no_batch_dim, test/test_nn.py::TestNN::test_GELU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_GLU_no_batch_dim, test/test_nn.py::TestNN::test_GLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Hardshrink_no_batch_dim, test/test_nn.py::TestNN::test_Hardshrink_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Hardsigmoid_no_batch_dim, test/test_nn.py::TestNN::test_Hardsigmoid_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Hardswish_no_batch_dim, test/test_nn.py::TestNN::test_Hardswish_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Hardtanh_no_batch_dim, test/test_nn.py::TestNN::test_Hardtanh_no_batch_dim_cuda, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_margin_no_reduce, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_margin_no_reduce_cuda, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_reduce, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_reduce_cuda, 
test/test_nn.py::TestNN::test_HuberLoss_delta, test/test_nn.py::TestNN::test_HuberLoss_delta_cuda, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_KLDivLoss_batch_mean, test/test_nn.py::TestNN::test_KLDivLoss_batch_mean_log_target, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_log_target, 
test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_log_target_cuda, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_scalar, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_scalar_log_target, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_scalar_log_target_cuda, test/test_nn.py::TestNN::test_KLDivLoss_with_log_target_no_reduce, test/test_nn.py::TestNN::test_KLDivLoss_with_log_target_no_reduce_cuda, test/test_nn.py::TestNN::test_KLDivLoss_with_target_no_reduce, test/test_nn.py::TestNN::test_KLDivLoss_with_target_no_reduce_cuda, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_mean, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_sum, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_L1Loss_no_reduce, test/test_nn.py::TestNN::test_L1Loss_no_reduce_complex, test/test_nn.py::TestNN::test_L1Loss_no_reduce_complex_cuda, test/test_nn.py::TestNN::test_L1Loss_no_reduce_cuda, test/test_nn.py::TestNN::test_L1Loss_no_reduce_scalar, test/test_nn.py::TestNN::test_L1Loss_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_LSTM_cell, test/test_nn.py::TestNN::test_LSTM_cell_forward_hidden_size, test/test_nn.py::TestNN::test_LSTM_cell_forward_input_size, test/test_nn.py::TestNN::test_LayerNorm_3d_no_affine_large_feature, 
test/test_nn.py::TestNN::test_LayerNorm_3d_no_affine_large_feature_cuda, test/test_nn.py::TestNN::test_LayerNorm_3d_no_affine_large_feature_eval, test/test_nn.py::TestNN::test_LayerNorm_3d_no_affine_large_feature_eval_cuda, test/test_nn.py::TestNN::test_LeakyReLU_no_batch_dim, test/test_nn.py::TestNN::test_LeakyReLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Linear, test/test_nn.py::TestNN::test_Linear_cuda, test/test_nn.py::TestNN::test_Linear_no_batch_dim, test/test_nn.py::TestNN::test_Linear_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Linear_no_bias, test/test_nn.py::TestNN::test_Linear_no_bias_cuda, test/test_nn.py::TestNN::test_LogSigmoid_no_batch_dim, test/test_nn.py::TestNN::test_LogSigmoid_no_batch_dim_cuda, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_none, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_MSELoss_no_reduce, test/test_nn.py::TestNN::test_MSELoss_no_reduce_cuda, test/test_nn.py::TestNN::test_MSELoss_no_reduce_scalar, test/test_nn.py::TestNN::test_MSELoss_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_mean_cuda_float, 
test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_MaxUnpool1d_net, test/test_nn.py::TestNN::test_MaxUnpool1d_net_cuda, test/test_nn.py::TestNN::test_MaxUnpool1d_net_no_batch_dim, test/test_nn.py::TestNN::test_MaxUnpool1d_net_no_batch_dim_cuda, test/test_nn.py::TestNN::test_MaxUnpool2d_net, test/test_nn.py::TestNN::test_MaxUnpool2d_net_cuda, test/test_nn.py::TestNN::test_MaxUnpool2d_net_no_batch_dim, test/test_nn.py::TestNN::test_MaxUnpool2d_net_no_batch_dim_cuda, test/test_nn.py::TestNN::test_MaxUnpool3d_net, test/test_nn.py::TestNN::test_MaxUnpool3d_net_cuda, test/test_nn.py::TestNN::test_MaxUnpool3d_net_no_batch_dim, test/test_nn.py::TestNN::test_MaxUnpool3d_net_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Mish_no_batch_dim, test/test_nn.py::TestNN::test_Mish_no_batch_dim_cuda, test/test_nn.py::TestNN::test_ModuleDict, test/test_nn.py::TestNN::test_ModuleList, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_0d_no_reduce, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_0d_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_1d_no_reduce, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_1d_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_index_neg, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_index_neg_cuda, 
test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_reduce, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_sum_cuda_float, 
test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_reduce, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_weights_no_reduce, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_weights_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiMarginLoss_1d_no_reduce, test/test_nn.py::TestNN::test_MultiMarginLoss_1d_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiMarginLoss_margin_no_reduce, test/test_nn.py::TestNN::test_MultiMarginLoss_margin_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiMarginLoss_no_reduce, test/test_nn.py::TestNN::test_MultiMarginLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiMarginLoss_p_no_reduce, test/test_nn.py::TestNN::test_MultiMarginLoss_p_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiMarginLoss_weights_no_reduce, test/test_nn.py::TestNN::test_MultiMarginLoss_weights_no_reduce_cuda, test/test_nn.py::TestNN::test_NLLLoss2d_no_reduce, test/test_nn.py::TestNN::test_NLLLoss2d_no_reduce_cuda, test/test_nn.py::TestNN::test_NLLLoss2d_no_reduce_ignore_index, test/test_nn.py::TestNN::test_NLLLoss2d_no_reduce_ignore_index_cuda, test/test_nn.py::TestNN::test_NLLLoss2d_no_reduce_weights, test/test_nn.py::TestNN::test_NLLLoss2d_no_reduce_weights_cuda, test/test_nn.py::TestNN::test_NLLLossNd_no_reduce, test/test_nn.py::TestNN::test_NLLLossNd_no_reduce_cuda, test/test_nn.py::TestNN::test_NLLLossNd_no_reduce_ignore_index, test/test_nn.py::TestNN::test_NLLLossNd_no_reduce_ignore_index_cuda, test/test_nn.py::TestNN::test_NLLLossNd_no_reduce_weights, test/test_nn.py::TestNN::test_NLLLossNd_no_reduce_weights_cuda, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_mean_cuda_half, 
test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_NLLLoss_no_reduce, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_ignore_index, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_ignore_index_cuda, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights_cuda, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights_ignore_index, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights_ignore_index_cuda, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights_ignore_index_neg, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights_ignore_index_neg_cuda, test/test_nn.py::TestNN::test_PReLU_backward_requires_grad_false, test/test_nn.py::TestNN::test_PReLU_no_batch_dim, test/test_nn.py::TestNN::test_PReLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_PairwiseDistance, test/test_nn.py::TestNN::test_PairwiseDistance_broadcast_lhs, test/test_nn.py::TestNN::test_PairwiseDistance_broadcast_lhs_cuda, test/test_nn.py::TestNN::test_PairwiseDistance_broadcast_rhs, test/test_nn.py::TestNN::test_PairwiseDistance_broadcast_rhs_cuda, test/test_nn.py::TestNN::test_PairwiseDistance_cuda, test/test_nn.py::TestNN::test_PairwiseDistance_no_batch_dim, test/test_nn.py::TestNN::test_PairwiseDistance_no_batch_dim_cuda, test/test_nn.py::TestNN::test_PairwiseDistance_with_non_default_args, test/test_nn.py::TestNN::test_PairwiseDistance_with_non_default_args_cuda, 
test/test_nn.py::TestNN::test_ParameterDict, test/test_nn.py::TestNN::test_ParameterDict_replication, test/test_nn.py::TestNN::test_ParameterList, test/test_nn.py::TestNN::test_ParameterList_meta, test/test_nn.py::TestNN::test_ParameterList_replication, test/test_nn.py::TestNN::test_PixelShuffle, test/test_nn.py::TestNN::test_PixelShuffle_cuda, test/test_nn.py::TestNN::test_PixelUnshuffle, test/test_nn.py::TestNN::test_PixelUnshuffle_cuda, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_reduce, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_RNN_cell, test/test_nn.py::TestNN::test_RNN_cell_forward_zero_hidden_size, test/test_nn.py::TestNN::test_RNN_cell_no_broadcasting, test/test_nn.py::TestNN::test_RNN_change_dropout, test/test_nn.py::TestNN::test_RNN_cpu_vs_cudnn_no_dropout, test/test_nn.py::TestNN::test_RNN_cpu_vs_cudnn_with_dropout, test/test_nn.py::TestNN::test_RNN_cudnn_weight_norm, test/test_nn.py::TestNN::test_RNN_dropout, test/test_nn.py::TestNN::test_RNN_dropout_state, test/test_nn.py::TestNN::test_RNN_input_size_zero, test/test_nn.py::TestNN::test_RNN_nonlinearity, 
test/test_nn.py::TestNN::test_RNN_nonlinearity_passed_as_arg, test/test_nn.py::TestNN::test_RReLU, test/test_nn.py::TestNN::test_RReLU_cuda, test/test_nn.py::TestNN::test_RReLU_no_batch_dim, test/test_nn.py::TestNN::test_RReLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_RReLU_with_up_down, test/test_nn.py::TestNN::test_RReLU_with_up_down_cuda, test/test_nn.py::TestNN::test_RReLU_with_up_down_scalar, test/test_nn.py::TestNN::test_RReLU_with_up_down_scalar_cuda, test/test_nn.py::TestNN::test_ReLU6_no_batch_dim, test/test_nn.py::TestNN::test_ReLU6_no_batch_dim_cuda, test/test_nn.py::TestNN::test_ReLU_no_batch_dim, test/test_nn.py::TestNN::test_ReLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_ReplicationPad3d, test/test_nn.py::TestNN::test_ReplicationPad3d_complex, test/test_nn.py::TestNN::test_ReplicationPad3d_complex_cuda, test/test_nn.py::TestNN::test_ReplicationPad3d_cuda, test/test_nn.py::TestNN::test_ReplicationPad3d_no_batch_dim, test/test_nn.py::TestNN::test_ReplicationPad3d_no_batch_dim_cuda, test/test_nn.py::TestNN::test_SELU_no_batch_dim, test/test_nn.py::TestNN::test_SELU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Sequential_add, test/test_nn.py::TestNN::test_Sequential_append, test/test_nn.py::TestNN::test_Sequential_delitem, test/test_nn.py::TestNN::test_Sequential_extend, test/test_nn.py::TestNN::test_Sequential_getitem, test/test_nn.py::TestNN::test_Sequential_iadd, test/test_nn.py::TestNN::test_Sequential_imul, test/test_nn.py::TestNN::test_Sequential_insert, test/test_nn.py::TestNN::test_Sequential_insert_fail_case, test/test_nn.py::TestNN::test_Sequential_mul, test/test_nn.py::TestNN::test_Sequential_pop, test/test_nn.py::TestNN::test_Sequential_rmul, test/test_nn.py::TestNN::test_Sequential_setitem, test/test_nn.py::TestNN::test_Sequential_setitem_named, test/test_nn.py::TestNN::test_SiLU_no_batch_dim, test/test_nn.py::TestNN::test_SiLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Sigmoid_no_batch_dim, 
test/test_nn.py::TestNN::test_Sigmoid_no_batch_dim_cuda, test/test_nn.py::TestNN::test_SmoothL1Loss_beta, test/test_nn.py::TestNN::test_SmoothL1Loss_beta_cuda, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_mean, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_none, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_sum, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_SmoothL1Loss_no_reduce, test/test_nn.py::TestNN::test_SmoothL1Loss_no_reduce_cuda, test/test_nn.py::TestNN::test_SmoothL1Loss_no_reduce_scalar, test/test_nn.py::TestNN::test_SmoothL1Loss_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_SmoothL1Loss_zero_beta, test/test_nn.py::TestNN::test_SmoothL1Loss_zero_beta_cuda, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_mean_cuda_float, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_sum, 
test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_SoftMarginLoss_no_reduce, test/test_nn.py::TestNN::test_SoftMarginLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_Softplus_no_batch_dim, test/test_nn.py::TestNN::test_Softplus_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Softshrink_no_batch_dim, test/test_nn.py::TestNN::test_Softshrink_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Softsign_no_batch_dim, test/test_nn.py::TestNN::test_Softsign_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Tanh_no_batch_dim, test/test_nn.py::TestNN::test_Tanh_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Tanhshrink_no_batch_dim, test/test_nn.py::TestNN::test_Tanhshrink_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Threshold_no_batch_dim, test/test_nn.py::TestNN::test_Threshold_no_batch_dim_cuda, test/test_nn.py::TestNN::test_TransformerDecoderLayer_gelu_activation, test/test_nn.py::TestNN::test_TransformerDecoderLayer_gelu_activation_cuda, test/test_nn.py::TestNN::test_TransformerDecoderLayer_relu_activation, test/test_nn.py::TestNN::test_TransformerDecoderLayer_relu_activation_cuda, test/test_nn.py::TestNN::test_TransformerEncoderLayer_gelu_activation, test/test_nn.py::TestNN::test_TransformerEncoderLayer_gelu_activation_cuda, test/test_nn.py::TestNN::test_TransformerEncoderLayer_relu_activation, test/test_nn.py::TestNN::test_TransformerEncoderLayer_relu_activation_cuda, test/test_nn.py::TestNN::test_Transformer_cell, test/test_nn.py::TestNN::test_Transformer_multilayer_coder, test/test_nn.py::TestNN::test_Transformer_multilayer_coder_cuda, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_mean_cuda_float, 
test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_none_cuda_float, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_sum_cuda_float, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_Unflatten_no_batch_dim, test/test_nn.py::TestNN::test_Unflatten_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Unfold, test/test_nn.py::TestNN::test_Unfold_cuda, test/test_nn.py::TestNN::test_Unfold_int_input, test/test_nn.py::TestNN::test_Unfold_int_input_cuda, test/test_nn.py::TestNN::test_adaptive_log_softmax, test/test_nn.py::TestNN::test_add_module, test/test_nn.py::TestNN::test_add_module_raises_error_if_attr_exists, test/test_nn.py::TestNN::test_affine_grid, test/test_nn.py::TestNN::test_affine_grid_3d, test/test_nn.py::TestNN::test_affine_grid_backward_cl_cf_consistency_device_cpu_nd_2, test/test_nn.py::TestNN::test_affine_grid_backward_cl_cf_consistency_device_cpu_nd_3, test/test_nn.py::TestNN::test_affine_grid_backward_cl_cf_consistency_device_cuda_nd_2, test/test_nn.py::TestNN::test_affine_grid_backward_cl_cf_consistency_device_cuda_nd_3, test/test_nn.py::TestNN::test_affine_grid_error_checking, test/test_nn.py::TestNN::test_assignment, test/test_nn.py::TestNN::test_batch_norm_update_stats, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_cpu_mixed_float16, 
test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_NCHW_float32, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_NCHW_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_NCHW_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_NCHW_float32, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_NCHW_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_NCHW_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_native_float32, 
test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_NCHW_float32, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_NCHW_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_NCHW_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_NCHW_float32, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_NCHW_mixed_bfloat16, 
test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_NCHW_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_buffer_update_when_stats_are_not_tracked, test/test_nn.py::TestNN::test_batchnorm_cudnn_half, test/test_nn.py::TestNN::test_batchnorm_cudnn_nhwc, test/test_nn.py::TestNN::test_batchnorm_half_overflow, test/test_nn.py::TestNN::test_batchnorm_load_state_dict, test/test_nn.py::TestNN::test_batchnorm_nhwc_cpu, test/test_nn.py::TestNN::test_batchnorm_nhwc_cuda, test/test_nn.py::TestNN::test_batchnorm_non_contig_cpu_BatchNorm2d, test/test_nn.py::TestNN::test_batchnorm_non_contig_cpu_SyncBatchNorm, test/test_nn.py::TestNN::test_batchnorm_nonaffine_cuda_half_input, test/test_nn.py::TestNN::test_batchnorm_raises_error_if_bias_is_not_same_size_as_input, test/test_nn.py::TestNN::test_batchnorm_raises_error_if_less_than_one_value_per_channel, test/test_nn.py::TestNN::test_batchnorm_raises_error_if_running_mean_is_not_same_size_as_input, test/test_nn.py::TestNN::test_batchnorm_raises_error_if_running_var_is_not_same_size_as_input, test/test_nn.py::TestNN::test_batchnorm_raises_error_if_running_var_or_running_mean_have_forward_grad, test/test_nn.py::TestNN::test_batchnorm_raises_error_if_weight_is_not_same_size_as_input, test/test_nn.py::TestNN::test_bce_loss_always_nonnegative, test/test_nn.py::TestNN::test_bce_loss_broadcasts_weights, test/test_nn.py::TestNN::test_bce_loss_input_range, test/test_nn.py::TestNN::test_bce_loss_size_mismatch, test/test_nn.py::TestNN::test_bce_with_logits_broadcasts_pos_weights, 
test/test_nn.py::TestNN::test_bce_with_logits_broadcasts_weights, test/test_nn.py::TestNN::test_bce_with_logits_gives_same_result_as_sigmoid_and_bce_loss, test/test_nn.py::TestNN::test_bce_with_logits_gives_same_result_as_sigmoid_and_bce_loss_large_tensors_with_grad, test/test_nn.py::TestNN::test_bce_with_logits_has_correct_forward_grad, test/test_nn.py::TestNN::test_bce_with_logits_has_correct_grad_at_zero, test/test_nn.py::TestNN::test_bce_with_logits_ones_in_pos_weights_are_the_same_as_none, test/test_nn.py::TestNN::test_bce_with_logits_raises_if_target_and_input_are_different_size, test/test_nn.py::TestNN::test_bce_with_logits_stability, test/test_nn.py::TestNN::test_bce_with_logits_with_pos_weight_has_correct_grad_at_zero, test/test_nn.py::TestNN::test_bilinear, test/test_nn.py::TestNN::test_bilinear_broadcasting, test/test_nn.py::TestNN::test_bilinear_no_bias, test/test_nn.py::TestNN::test_bilinear_non_contiguous, test/test_nn.py::TestNN::test_bilinear_value_error, test/test_nn.py::TestNN::test_broadcast_double_backwards_gpu, test/test_nn.py::TestNN::test_broadcast_no_grad, test/test_nn.py::TestNN::test_broadcast_not_requiring_grad, test/test_nn.py::TestNN::test_buffer_bad_module_subclass, test/test_nn.py::TestNN::test_buffer_not_persistent, test/test_nn.py::TestNN::test_buffer_not_persistent_assign, test/test_nn.py::TestNN::test_buffer_not_persistent_del, test/test_nn.py::TestNN::test_buffer_not_persistent_load, test/test_nn.py::TestNN::test_buffer_not_persistent_overwrite, test/test_nn.py::TestNN::test_buffers_and_named_buffers, test/test_nn.py::TestNN::test_call_supports_python_dict_output, test/test_nn.py::TestNN::test_channel_shuffle_input_checks, test/test_nn.py::TestNN::test_channel_shuffle_return_alias_of_self, test/test_nn.py::TestNN::test_children, test/test_nn.py::TestNN::test_container_copy, test/test_nn.py::TestNN::test_convert_sync_batchnorm, test/test_nn.py::TestNN::test_cosine_embedding_loss_error_on_diff_shapes, 
test/test_nn.py::TestNN::test_cosine_embedding_loss_error_on_nonexpandable_shapes, test/test_nn.py::TestNN::test_cosine_embedding_loss_invalid_shape, test/test_nn.py::TestNN::test_cosine_embedding_loss_margin_no_reduce, test/test_nn.py::TestNN::test_cosine_embedding_loss_no_reduce, test/test_nn.py::TestNN::test_cosine_embedding_loss_with_diff_type, test/test_nn.py::TestNN::test_cosine_similarity, test/test_nn.py::TestNN::test_cross_entropy_loss, test/test_nn.py::TestNN::test_cross_entropy_loss_precision, test/test_nn.py::TestNN::test_cross_entropy_loss_zero_div, test/test_nn.py::TestNN::test_cudnn_forward_exception, test/test_nn.py::TestNN::test_cudnn_rnn_dropout_states_device, test/test_nn.py::TestNN::test_cudnn_weight_format, test/test_nn.py::TestNN::test_cudnn_weight_tying, test/test_nn.py::TestNN::test_dir, test/test_nn.py::TestNN::test_dir_digit, test/test_nn.py::TestNN::test_elu_inplace_gradgrad, test/test_nn.py::TestNN::test_elu_inplace_on_view, test/test_nn.py::TestNN::test_error_RNN_seq_len_zero, test/test_nn.py::TestNN::test_extra_state, test/test_nn.py::TestNN::test_extra_state_missing_get_extra_state, test/test_nn.py::TestNN::test_extra_state_missing_set_extra_state, test/test_nn.py::TestNN::test_extra_state_non_dict, test/test_nn.py::TestNN::test_fb_fc_packed, test/test_nn.py::TestNN::test_flatten, test/test_nn.py::TestNN::test_fold_invalid_arg, test/test_nn.py::TestNN::test_fractional_max_pool2d_invalid_output_ratio, test/test_nn.py::TestNN::test_gaussian_nll_loss_args, test/test_nn.py::TestNN::test_gaussian_nll_loss_broadcasting, test/test_nn.py::TestNN::test_gaussian_nll_loss_scalar_var, test/test_nn.py::TestNN::test_get_buffer, test/test_nn.py::TestNN::test_get_buffer_from_submodules, test/test_nn.py::TestNN::test_getattr_with_property, test/test_nn.py::TestNN::test_grid_sample, test/test_nn.py::TestNN::test_grid_sample_3d, test/test_nn.py::TestNN::test_grid_sample_error_checking, 
test/test_nn.py::TestNN::test_grid_sample_nearest_neighbor_rounding_mode_consistency, test/test_nn.py::TestNN::test_hardtanh_backward, test/test_nn.py::TestNN::test_hardtanh_inplace_gradgrad, test/test_nn.py::TestNN::test_huber_loss_invalid_delta, test/test_nn.py::TestNN::test_inplace_thnn, test/test_nn.py::TestNN::test_interpolate, test/test_nn.py::TestNN::test_interpolate_bicubic_2d, test/test_nn.py::TestNN::test_interpolate_bicubic_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_2d_zero_dim, test/test_nn.py::TestNN::test_interpolate_bicubic_2d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_2d, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_shared_2d, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_shared_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_skewed_2d, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_skewed_2d_align_corners, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_skewed_2d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_skewed_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_tuple_2d, test/test_nn.py::TestNN::test_interpolate_bicubic_tuple_2d_align_corners, test/test_nn.py::TestNN::test_interpolate_bicubic_tuple_2d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_tuple_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_2d, test/test_nn.py::TestNN::test_interpolate_bilinear_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_2d_zero_dim, test/test_nn.py::TestNN::test_interpolate_bilinear_2d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_2d, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_tuple_shared_2d, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_tuple_shared_2d_cuda, 
test/test_nn.py::TestNN::test_interpolate_bilinear_scale_tuple_skewed_2d, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_tuple_skewed_2d_align_corners, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_tuple_skewed_2d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_tuple_skewed_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_tuple_2d, test/test_nn.py::TestNN::test_interpolate_bilinear_tuple_2d_align_corners, test/test_nn.py::TestNN::test_interpolate_bilinear_tuple_2d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_tuple_2d_cuda, test/test_nn.py::TestNN::test_interpolate_buffer_overflow, test/test_nn.py::TestNN::test_interpolate_illegal_memory_access, test/test_nn.py::TestNN::test_interpolate_linear_1d, test/test_nn.py::TestNN::test_interpolate_linear_1d_align_corners, test/test_nn.py::TestNN::test_interpolate_linear_1d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_linear_1d_cuda, test/test_nn.py::TestNN::test_interpolate_linear_1d_zero_dim, test/test_nn.py::TestNN::test_interpolate_linear_1d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_linear_scale_1d, test/test_nn.py::TestNN::test_interpolate_linear_scale_1d_align_corners, test/test_nn.py::TestNN::test_interpolate_linear_scale_1d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_linear_scale_1d_cuda, test/test_nn.py::TestNN::test_interpolate_linear_tuple_1d, test/test_nn.py::TestNN::test_interpolate_linear_tuple_1d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_1d, test/test_nn.py::TestNN::test_interpolate_nearest_1d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_1d_zero_dim, test/test_nn.py::TestNN::test_interpolate_nearest_1d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_2d, test/test_nn.py::TestNN::test_interpolate_nearest_2d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_2d_launch_configs, 
test/test_nn.py::TestNN::test_interpolate_nearest_2d_launch_configs_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_2d_zero_dim, test/test_nn.py::TestNN::test_interpolate_nearest_2d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_3d, test/test_nn.py::TestNN::test_interpolate_nearest_3d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_3d_zero_dim, test/test_nn.py::TestNN::test_interpolate_nearest_3d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_scale_1d, test/test_nn.py::TestNN::test_interpolate_nearest_scale_1d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_scale_2d, test/test_nn.py::TestNN::test_interpolate_nearest_scale_2d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_scale_3d, test/test_nn.py::TestNN::test_interpolate_nearest_scale_3d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_1d, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_1d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_2d, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_2d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_3d, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_3d_cuda, test/test_nn.py::TestNN::test_interpolate_trilinear_3d, test/test_nn.py::TestNN::test_interpolate_trilinear_3d_cuda, test/test_nn.py::TestNN::test_interpolate_trilinear_3d_zero_dim, test/test_nn.py::TestNN::test_interpolate_trilinear_3d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_trilinear_scale_3d, test/test_nn.py::TestNN::test_interpolate_trilinear_scale_3d_align_corners, test/test_nn.py::TestNN::test_interpolate_trilinear_scale_3d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_trilinear_scale_3d_cuda, test/test_nn.py::TestNN::test_interpolate_trilinear_tuple_3d, test/test_nn.py::TestNN::test_interpolate_trilinear_tuple_3d_align_corners, test/test_nn.py::TestNN::test_interpolate_trilinear_tuple_3d_align_corners_cuda, 
test/test_nn.py::TestNN::test_interpolate_trilinear_tuple_3d_cuda, test/test_nn.py::TestNN::test_interpolate_undefined_behavior_casting, test/test_nn.py::TestNN::test_kl_div_log_softmax_target, test/test_nn.py::TestNN::test_kl_div_with_diff_type, test/test_nn.py::TestNN::test_kl_div_with_diff_type_log_target, test/test_nn.py::TestNN::test_l1_loss_correct, test/test_nn.py::TestNN::test_layer_norm_backwards_eps, test/test_nn.py::TestNN::test_layer_norm_eps, test/test_nn.py::TestNN::test_layer_norm_grads_with_create_graph_flag, test/test_nn.py::TestNN::test_layer_norm_large_tensor, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_bias_weightCOO, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_bias_weightCSC, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_bias_weightCSR, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_bias_weightStrided, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_nobias_weightCOO, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_nobias_weightCSC, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_nobias_weightCSR, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_nobias_weightStrided, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_bias_weightCOO, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_bias_weightCSC, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_bias_weightCSR, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_bias_weightStrided, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_nobias_weightCOO, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_nobias_weightCSC, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_nobias_weightCSR, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_nobias_weightStrided, test/test_nn.py::TestNN::test_linear_broadcasting, test/test_nn.py::TestNN::test_linear_raise_on_scalar_input, test/test_nn.py::TestNN::test_log_softmax_dim0, 
test/test_nn.py::TestNN::test_log_softmax_dim0_cuda, test/test_nn.py::TestNN::test_log_softmax_dim3, test/test_nn.py::TestNN::test_log_softmax_dim3_cuda, test/test_nn.py::TestNN::test_log_softmax_lastdim, test/test_nn.py::TestNN::test_log_softmax_lastdim_cuda, test/test_nn.py::TestNN::test_log_softmax_scalar, test/test_nn.py::TestNN::test_log_softmax_scalar_cuda, test/test_nn.py::TestNN::test_log_softmax_spatial, test/test_nn.py::TestNN::test_log_softmax_spatial_cuda, test/test_nn.py::TestNN::test_log_softmax_spatial_special, test/test_nn.py::TestNN::test_log_softmax_spatial_special_cuda, test/test_nn.py::TestNN::test_loss_equal_input_target_shape, test/test_nn.py::TestNN::test_margin_ranking_loss_margin_no_reduce, test/test_nn.py::TestNN::test_margin_ranking_loss_no_reduce, test/test_nn.py::TestNN::test_max_pool1d_invalid_output_size, test/test_nn.py::TestNN::test_module_apply_inplace_op, test/test_nn.py::TestNN::test_module_backcompat, test/test_nn.py::TestNN::test_module_super_init, test/test_nn.py::TestNN::test_module_to_argparse, test/test_nn.py::TestNN::test_modules, test/test_nn.py::TestNN::test_mse_loss_size_warning, test/test_nn.py::TestNN::test_multimarginloss_1d_input_0d_target_no_reduce, test/test_nn.py::TestNN::test_multimarginloss_1d_input_0d_target_no_reduce_cuda, test/test_nn.py::TestNN::test_named_children, test/test_nn.py::TestNN::test_named_modules, test/test_nn.py::TestNN::test_named_parameters_remove_duplicate, test/test_nn.py::TestNN::test_native_channel_shuffle_return_alias_of_self, test/test_nn.py::TestNN::test_nested_tensor_from_mask, test/test_nn.py::TestNN::test_nested_tensor_from_mask_error, test/test_nn.py::TestNN::test_no_grad, test/test_nn.py::TestNN::test_non_leaf_parameters, test/test_nn.py::TestNN::test_normalize, test/test_nn.py::TestNN::test_overwrite_module_params_on_conversion, test/test_nn.py::TestNN::test_pack_sequence_batch_sizes_throw, test/test_nn.py::TestNN::test_pad_scalar_error, 
test/test_nn.py::TestNN::test_padding_list, test/test_nn.py::TestNN::test_pairwise_distance, test/test_nn.py::TestNN::test_parameter_assignment, test/test_nn.py::TestNN::test_parameterlistdict_pickle, test/test_nn.py::TestNN::test_parameterlistdict_setting_attributes, test/test_nn.py::TestNN::test_parameters_and_named_parameters, test/test_nn.py::TestNN::test_parameters_to_vector, test/test_nn.py::TestNN::test_parse_to, test/test_nn.py::TestNN::test_partial_flat_weights, test/test_nn.py::TestNN::test_pdist, test/test_nn.py::TestNN::test_pdist_cpu_gradgrad_unimplemented, test/test_nn.py::TestNN::test_pdist_cuda_gradgrad_unimplemented, test/test_nn.py::TestNN::test_pdist_empty_col, test/test_nn.py::TestNN::test_pdist_empty_row, test/test_nn.py::TestNN::test_pdist_large, test/test_nn.py::TestNN::test_pdist_zeros, test/test_nn.py::TestNN::test_pickle_module_no_weights_only_warning, test/test_nn.py::TestNN::test_pixel_shuffle_nhwc_cpu, test/test_nn.py::TestNN::test_pixel_shuffle_unshuffle, test/test_nn.py::TestNN::test_pointwise_loss_broadcast, test/test_nn.py::TestNN::test_pointwise_loss_target_grad_none_reduction, test/test_nn.py::TestNN::test_projections_errors_on_gru_and_rnn, test/test_nn.py::TestNN::test_projections_lstm_args_check, test/test_nn.py::TestNN::test_projections_lstm_check_device, test/test_nn.py::TestNN::test_projections_lstm_initial_hidden_state, test/test_nn.py::TestNN::test_register_buffer_allows_overwriting_with_same_name, test/test_nn.py::TestNN::test_register_buffer_allows_tensor_like_object, test/test_nn.py::TestNN::test_register_buffer_raises_error_if_attr_exists, test/test_nn.py::TestNN::test_register_buffer_raises_error_if_name_is_not_string, test/test_nn.py::TestNN::test_register_buffer_raises_error_if_not_tensor, test/test_nn.py::TestNN::test_register_parameter_allows_overwriting_with_same_name, test/test_nn.py::TestNN::test_register_parameter_raises_error_if_attr_exists, 
test/test_nn.py::TestNN::test_register_parameter_raises_error_if_name_is_not_string, test/test_nn.py::TestNN::test_relu_inplace_on_view, test/test_nn.py::TestNN::test_repr, test/test_nn.py::TestNN::test_requires_grad_, test/test_nn.py::TestNN::test_rnn_args_check, test/test_nn.py::TestNN::test_rnn_check_device, test/test_nn.py::TestNN::test_rnn_initial_hidden_state, test/test_nn.py::TestNN::test_rnn_weight_norm, test/test_nn.py::TestNN::test_set_submodule, test/test_nn.py::TestNN::test_share_memory, test/test_nn.py::TestNN::test_smoothl1loss_intergral_target, test/test_nn.py::TestNN::test_smoothl1loss_negative_beta_not_supported, test/test_nn.py::TestNN::test_softmax_functional_dim0, test/test_nn.py::TestNN::test_softmax_functional_dim0_cuda, test/test_nn.py::TestNN::test_softmax_functional_dim3, test/test_nn.py::TestNN::test_softmax_functional_dim3_cuda, test/test_nn.py::TestNN::test_softmax_functional_scalar, test/test_nn.py::TestNN::test_softmax_functional_scalar_cuda, test/test_nn.py::TestNN::test_softmax_lastdim, test/test_nn.py::TestNN::test_softmax_lastdim_cuda, test/test_nn.py::TestNN::test_softmax_lastdim_dtype, test/test_nn.py::TestNN::test_softmax_lastdim_dtype_cuda, test/test_nn.py::TestNN::test_softmax_spatial, test/test_nn.py::TestNN::test_softmax_spatial_cuda, test/test_nn.py::TestNN::test_softmax_spatial_dtype, test/test_nn.py::TestNN::test_softmax_spatial_dtype_cuda, test/test_nn.py::TestNN::test_softmax_spatial_special, test/test_nn.py::TestNN::test_softmax_spatial_special_cuda, test/test_nn.py::TestNN::test_softmin, test/test_nn.py::TestNN::test_spectral_norm, test/test_nn.py::TestNN::test_spectral_norm_dim, test/test_nn.py::TestNN::test_spectral_norm_forward, test/test_nn.py::TestNN::test_spectral_norm_load_state_dict, test/test_nn.py::TestNN::test_spectral_norm_pickle, test/test_nn.py::TestNN::test_state_dict, test/test_nn.py::TestNN::test_swap_module_params_poisons_acc_grad, test/test_nn.py::TestNN::test_sync_batchnorm_accuracy_cuda, 
test/test_nn.py::TestNN::test_sync_batchnorm_backward_elemt, test/test_nn.py::TestNN::test_threshold_bfloat16_half, test/test_nn.py::TestNN::test_threshold_int, test/test_nn.py::TestNN::test_to, test/test_nn.py::TestNN::test_train_errors_for_invalid_mode, test/test_nn.py::TestNN::test_transformer_args_check, test/test_nn.py::TestNN::test_transformer_layer_args_check, test/test_nn.py::TestNN::test_transformerdecoder, test/test_nn.py::TestNN::test_transformerdecoderlayer, test/test_nn.py::TestNN::test_transformerdecoderlayer_gelu, test/test_nn.py::TestNN::test_triplet_margin_loss, test/test_nn.py::TestNN::test_triplet_margin_loss_no_reduce, test/test_nn.py::TestNN::test_triplet_margin_loss_swap, test/test_nn.py::TestNN::test_triplet_margin_loss_swap_no_reduce, test/test_nn.py::TestNN::test_type, test/test_nn.py::TestNN::test_unflatten, test/test_nn.py::TestNN::test_unflatten_invalid_arg, test/test_nn.py::TestNN::test_unfold_invalid_arg, test/test_nn.py::TestNN::test_upsamplingBilinear2d_spatial_invariance, test/test_nn.py::TestNN::test_upsamplingLinear1d, test/test_nn.py::TestNN::test_upsamplingLinear1d_spatial_invariance, test/test_nn.py::TestNN::test_upsamplingTrilinear3d_spatial_invariance, test/test_nn.py::TestNN::test_upsampling_bfloat16, test/test_nn.py::TestNN::test_upsampling_not_recompute_scale_factor, test/test_nn.py::TestNN::test_upsampling_small_scale, test/test_nn.py::TestNN::test_vector_to_parameters, test/test_nn.py::TestNN::test_weight_norm, test/test_nn.py::TestNN::test_weight_norm_pickle, test/test_nn.py::TestNN::test_weighted_huber_loss, test/test_nn.py::TestNN::test_weighted_l1_loss_with_weights, test/test_nn.py::TestNN::test_weighted_mse_loss, test/test_nn.py::TestNN::test_zero_grad, test/test_nn.py::TestFusionEval::test_fuse_module_eval_numerics, test/test_nn.py::TestConstantPadNd::test_constant_pad_nd, test/test_nn.py::TestConstantPadNd::test_preserves_memory_format, test/test_nn.py::TestAddRelu::test_add_relu, 
test/test_nn.py::TestAddRelu::test_add_relu_broadcasting, test/test_nn.py::TestFunctionalPickle::test_pickle_softsign, test/test_nn.py::TestFusionUtils::test_fuse_conv_bn_requires_grad, test/test_nn.py::TestFusionUtils::test_fuse_linear_bn_requires_grad, test/test_nn.py::TestUtils::test_consume_prefix_in_state_dict_if_present, test/test_nn.py::TestNNDeviceTypeCUDA::test_BatchNorm_empty_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_Bilinear_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_cudnn_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_empty_target_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_mean_use_module_form_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_mean_use_module_form_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_none_use_module_form_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_none_use_module_form_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_sum_use_module_form_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_sum_use_module_form_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_GRU_grad_and_gradgrad_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_GroupNorm_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_GroupNorm_general_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_GroupNorm_memory_format_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_GroupNorm_numeric_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_GroupNorm_raises_error_if_one_value_per_group_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_InstanceNorm1d_general_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_InstanceNorm2d_general_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_InstanceNorm3d_general_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_LSTM_differentiable_backward_using_oneDNN_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_LSTM_differentiable_backward_using_oneDNN_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_LSTM_grad_and_gradgrad_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_LayerNorm_general_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_LayerNorm_numeric_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_LocalResponseNorm_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_MarginLoss_empty_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_MarginLoss_empty_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_MarginLoss_race_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_MarginLoss_race_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_MarginLoss_warnings_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad2d_large_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad2d_large_deterministic_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad3d_large_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad_empty_cuda_complex64, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad_empty_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad_fails_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReplicationPad1d_large_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReplicationPad2d_large_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReplicationPad3d_large_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReplicationPad_empty_cuda_complex128, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReplicationPad_empty_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_TransformerDecoderLayer_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_TransformerDecoder_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_TransformerEncoderLayer_empty_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_TransformerEncoder_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_Transformer_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_Unfold_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_activations_bfloat16_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_activations_bfloat16_half_cpu_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_activations_bfloat16_half_cpu_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_adaptiveavg_pool1d_shmem_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_affine_2d_rotate0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_affine_2d_rotate45_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_affine_2d_rotate90_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_affine_2d_rotateRandom_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_affine_3d_rotateRandom_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_avg_pool_large_tensor2_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_avg_pool_large_tensor_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_affine_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_affine_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_affine_mixed_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_affine_mixed_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_eval_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_eval_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_eval_mixed_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_eval_mixed_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_grad_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_large_batch_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_large_batch_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_simple_average_cuda_bfloat16, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_simple_average_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_simple_average_mixed_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_simple_average_mixed_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_update_stats_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_channel_shuffle_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_error_if_nonfinite_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_0_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_1_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_2_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_4_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_inf_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_0_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_1_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_2_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_4_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_inf_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_multi_device_foreach_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_multi_device_foreach_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_value_foreach_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_value_foreach_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_empty_input_cuda_complex128, test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_empty_input_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_empty_input_cuda_float32, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_empty_input_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_64bit_reduction_mean_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_64bit_reduction_none_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_64bit_reduction_sum_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_label_smoothing_consistent_index_target_and_probs_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_label_smoothing_errors_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_label_smoothing_weight_ignore_indices_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_label_smoothing_with_probs_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_large_tensor_reduction_mean_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_large_tensor_reduction_none_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_large_tensor_reduction_sum_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_2d_out_of_bounds_class_index_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_2d_out_of_bounds_class_index_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_index_target_unit_weights_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_one_hot_target_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_all_reductions_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_mean_weighted_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_mean_weighted_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_none_weighted_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_none_weighted_True_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_sum_weighted_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_sum_weighted_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_unit_weights_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ctc_loss_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ctc_loss_cudnn_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ctc_loss_cudnn_tensor_cpu_length_cuda_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ctc_loss_cudnn_tensor_cuda_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ctc_loss_error_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cudnn_rnn_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_cudnn_rnn_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_device_mask_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_elu_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_elu_inplace_with_neg_alpha_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_fold_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_glu_bfloat16_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_bfloat16_precision_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_half_precision_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_index_2d_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_index_2d_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_index_3d_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_index_3d_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_nan_inf_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_nan_inf_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_groupnorm_nhwc_cuda_bfloat16, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_groupnorm_nhwc_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_groupnorm_nhwc_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_groupnorm_nhwc_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_gumbel_softmax_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_gumbel_softmax_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_gumbel_softmax_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardsigmoid_grad_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardswish_grad_corner_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardswish_grad_corner_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardswish_grad_corner_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardswish_grad_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardswish_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_for_single_spatial_element_during_training_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm1d_no_batch_dim_False_affine_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm1d_no_batch_dim_False_affine_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm1d_no_batch_dim_True_affine_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm1d_no_batch_dim_True_affine_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm2d_no_batch_dim_False_affine_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm2d_no_batch_dim_False_affine_True_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm2d_no_batch_dim_True_affine_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm2d_no_batch_dim_True_affine_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm3d_no_batch_dim_False_affine_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm3d_no_batch_dim_False_affine_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm3d_no_batch_dim_True_affine_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm3d_no_batch_dim_True_affine_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_less_than_one_value_per_channel_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_invalid_reduction_strings_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_layernorm_half_precision_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_layernorm_weight_bias_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_leaky_relu_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_leaky_relu_inplace_with_neg_slope_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_leaky_relu_inplace_with_zero_slope_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_linear_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_log_softmax_big_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_log_softmax_big_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_log_softmax_cpu_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_log_softmax_cpu_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_logsigmoid_out_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_lstmcell_backward_only_one_output_grad_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_TxT_layout_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_devices_parity_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_forward_with_nans_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_grad_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_lowp_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_lowp_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_mask_types_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_transformer_layout_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_mish_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_module_to_empty_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_module_to_empty_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_module_to_empty_non_recursive_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_mse_loss_error_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_1d_input_1d_target_invalid_size_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_all_ignored_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_byte_target_matches_long_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_empty_tensor_reduction_mean_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_empty_tensor_reduction_none_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_empty_tensor_reduction_sum_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_invalid_target_dim_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_invalid_weights_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_large_tensor_reduction_mean_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_large_tensor_reduction_none_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_large_tensor_reduction_sum_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_mismatched_batch_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_out_of_bounds_ignore_index_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_total_weight_is_zero_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nn_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nn_scalars_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nn_scalars_reductions_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nonlinearity_propagate_nan_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_one_hot_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_overwrite_module_params_on_conversion_cpu_device_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_pad_cuda_complex128, test/test_nn.py::TestNNDeviceTypeCUDA::test_pad_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_prelu_backward_32bit_indexing_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_replicatepad_64bit_indexing_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_epsilon_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_epsilon_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_epsilon_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_epsilon_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_numeric_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_numeric_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rnn_fused_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_rnn_fused_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_rnn_retain_variables_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rnn_retain_variables_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_rnn_retain_variables_cuda_float64, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_rrelu_bounds_validation_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_save_lstm_compatibility_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_silu_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_skip_init_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_smooth_l1_loss_bfloat16_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_smooth_l1_loss_vs_huber_loss_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_smoothl1loss_backward_zero_beta_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_64bit_indexing_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_64bit_indexing_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_smem_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_unaligned_grad_output_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_unaligned_output_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_without_fully_vectorized_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_bfloat16_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_cpu_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_cpu_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_double_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_forward_64bit_indexing_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_results_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_results_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_softplus_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softplus_low_threshold_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softshrink_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softshrink_inplace_overlap_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_softshrink_negative_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_threshold_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_to_complex_cuda_complex128, test/test_nn.py::TestNNDeviceTypeCUDA::test_to_complex_cuda_complex64, test/test_nn.py::TestNNDeviceTypeCUDA::test_to_complex_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_fast_path_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_gelu_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_gelu_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_triplet_margin_with_distance_loss_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_triplet_margin_with_distance_loss_default_parity_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format0_align_corners_False_input_size_399_output_size_437_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format0_align_corners_False_input_size_403_output_size_377_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format0_align_corners_True_input_size_399_output_size_437_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format0_align_corners_True_input_size_403_output_size_377_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format1_align_corners_False_input_size_399_output_size_437_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format1_align_corners_False_input_size_403_output_size_377_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format1_align_corners_True_input_size_399_output_size_437_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format1_align_corners_True_input_size_403_output_size_377_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bicubic_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bicubic_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bilinear_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bilinear_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_True_mode_bicubic_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_True_mode_bicubic_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_True_mode_bilinear_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_True_mode_bilinear_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_False_mode_bicubic_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_False_mode_bicubic_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_False_mode_bilinear_memory_format0_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_False_mode_bilinear_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_True_mode_bicubic_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_True_mode_bicubic_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_True_mode_bilinear_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_True_mode_bilinear_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_float32_cuda_float32, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_uint8_cuda_uint8, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_int8_cuda_int8, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_int64_cuda_int64, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_int32_cuda_int32, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_int16_cuda_int16, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_float64_cuda_float64, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest-exact_float32_cuda_float32, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest-exact_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest-exact_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest-exact_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest-exact_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest-exact_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest-exact_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_uint8_cuda_uint8, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBicubic2d_aa_correctness_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBicubic2d_aa_correctness_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBicubic2d_correctness_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBilinear2d_aa_correctness_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBilinear2d_aa_correctness_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest1d_correctness_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest1d_correctness_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest1d_launch_config_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest1d_mode_nearest-exact_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest1d_mode_nearest_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_correctness_memory_format0_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_correctness_memory_format0_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_correctness_memory_format1_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_correctness_memory_format1_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_launch_config_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_launch_fail_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_launch_rocm_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_memory_format0_mode_nearest-exact_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_memory_format0_mode_nearest_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_memory_format1_mode_nearest-exact_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_memory_format1_mode_nearest_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_correctness_memory_format0_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_correctness_memory_format0_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_correctness_memory_format1_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_correctness_memory_format1_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_launch_config_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_memory_format0_mode_nearest-exact_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_memory_format0_mode_nearest_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_memory_format1_mode_nearest-exact_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_memory_format1_mode_nearest_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact1d_correctness_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact1d_correctness_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact1d_rescale_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact2d_correctness_memory_format0_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact2d_correctness_memory_format0_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact2d_correctness_memory_format1_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact2d_correctness_memory_format1_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact3d_correctness_memory_format0_isize_10_osize_15_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact3d_correctness_memory_format0_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact3d_correctness_memory_format1_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact3d_correctness_memory_format1_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingTrilinear3d_align_corners_False_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingTrilinear3d_align_corners_False_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingTrilinear3d_align_corners_True_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingTrilinear3d_align_corners_True_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsampling_64bit_indexing_channels_last_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsampling_64bit_indexing_channels_last_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingnearest2d_backward_64bit_indexing_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_variable_sequence_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_variable_sequence_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_variable_sequence_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_warp_softmax_64bit_indexing_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_warp_softmax_64bit_indexing_cuda_float32 2025-10-10T01:58:52.0356329Z 2025-10-10T01:58:52.0356578Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T01:58:52.0356999Z Uploading artifacts took 0.00 seconds 2025-10-10T01:58:52.0357387Z Running test_privateuseone_python_backend 1/1 ... 
[2025-10-10 01:58:51.825979]
2025-10-10T01:58:52.0357789Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T01:58:52.0358713Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_privateuseone_python_backend.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:58:51.826573]
2025-10-10T01:58:55.1009966Z
2025-10-10T01:58:55.1011593Z test_privateuseone_python_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_privateuseone_python_backend_1.1_6fb6ca2953b4730c_.log
2025-10-10T01:58:55.1014331Z Running 2 items in this shard: test/test_privateuseone_python_backend.py::PrivateUse1BackendTest::test_accessing_is_pinned, test/test_privateuseone_python_backend.py::PrivateUse1BackendTest::test_backend_simple
2025-10-10T01:58:55.1015865Z
2025-10-10T01:58:55.1016225Z Running test_spectral_ops 1/1 ... [2025-10-10 01:58:55.101203]
2025-10-10T01:58:55.1016920Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T01:58:55.1023696Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_spectral_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ...
[2025-10-10 01:58:55.101784] 2025-10-10T02:02:25.7993692Z 2025-10-10T02:02:25.7995140Z test_spectral_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_spectral_ops_1.1_d1a97a2ad2a431a5_.log 2025-10-10T02:02:25.8190175Z Running 347 items in this shard: test/test_spectral_ops.py::TestFFTCUDA::test_batch_istft_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_complex_istft_real_equiv_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_definition_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_onesided_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_real_equiv_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_roundtrip_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_roundtrip_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_cufft_context_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_cufft_context_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_cufft_plan_cache_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fftn_cuda_complex64, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifftn_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfft2_cuda_float32, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfftn_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft2_cuda_float32, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_ifft_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_fftn_equivalence_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_fftn_equivalence_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_invalid_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_numpy_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_numpy_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_fft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_fft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_fftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_hfft2_cuda_bfloat16, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_hfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_hfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ifft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ifft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ifftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ihfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ihfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ihfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_irfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_irfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_irfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_rfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_rfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_rfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_fft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_fft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_fftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_hfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_hfft_cuda_bfloat16, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_hfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ifft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ifft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ifftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ihfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ihfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ihfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_irfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_irfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_irfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_rfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_rfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_rfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fftn_cuda_float16, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ihfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ihfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfft2_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_rfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_rfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_rfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfft_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ihfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ihfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfftn_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_rfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_rfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_rfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_ifft_rfft_irfft_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_input_modification_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft_invalid_dtypes_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft_plan_repeatable_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_int8, test/test_spectral_ops.py::TestFFTCUDA::test_fftfreq_numpy_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftfreq_numpy_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftfreq_out_cuda_float32, 
test/test_spectral_ops.py::TestFFTCUDA::test_fftfreq_out_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_fftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_ifftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_irfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_rfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_fftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_ifftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_irfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_rfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_complex64, 
test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_frequencies_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_frequencies_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_numpy_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_numpy_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_numpy_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_numpy_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_hfftn_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_ihfftn_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_against_librosa_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_linearity_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_of_sine_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_requires_window_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_istft_round_trip_simple_cases_cuda_float64, 
test/test_spectral_ops.py::TestFFTCUDA::test_istft_round_trip_various_params_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_round_trip_with_padding_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_throws_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_fft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_fft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_hfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_hfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_ifft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_ifft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_ihfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_irfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_irfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_rfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_fft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_fft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_hfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_hfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_ifft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_ifft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_ihfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_irfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_irfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_rfft_cuda_float32, 
test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_fftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_hfftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_ifftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_irfftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_fftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_hfftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_ifftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_irfftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_stft_align_to_window_only_requires_non_center_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_stft_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_stft_requires_complex_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_stft_requires_window_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_stft_roundtrip_complex_window_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_stft_roundtrip_complex_window_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_stft_window_device_cuda 2025-10-10T02:02:25.8358204Z 
2025-10-10T02:02:25.8358629Z Running functorch/test_memory_efficient_fusion 1/1 ... [2025-10-10 02:02:25.800112] 2025-10-10T02:02:25.8359396Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:02:25.8361088Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_memory_efficient_fusion.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:25.800731] 2025-10-10T02:02:41.4052607Z 2025-10-10T02:02:41.4054525Z functorch/test_memory_efficient_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_memory_efficient_fusion_1.1_5a6099b52d19d0ad_.log 2025-10-10T02:02:41.4072151Z Running 22 items in this shard: test/functorch/test_memory_efficient_fusion.py::TestMemoryEfficientOpAuthoring::test_gelu_bias, test/functorch/test_memory_efficient_fusion.py::TestMemoryEfficientOpAuthoring::test_hard_sigmoid, test/functorch/test_memory_efficient_fusion.py::TestMemoryEfficientOpAuthoring::test_hard_swish, test/functorch/test_memory_efficient_fusion.py::TestMemoryEfficientOpAuthoring::test_layer_norm, test/functorch/test_memory_efficient_fusion.py::TestMemoryEfficientOpAuthoring::test_mish, test/functorch/test_memory_efficient_fusion.py::TestMemoryEfficientOpAuthoring::test_rmsnorm, test/functorch/test_memory_efficient_fusion.py::TestMemoryEfficientOpAuthoring::test_swish, test/functorch/test_memory_efficient_fusion.py::NoChangeTestCase::test_empty, test/functorch/test_memory_efficient_fusion.py::NoChangeTestCase::test_hash_with_numbers, test/functorch/test_memory_efficient_fusion.py::NoChangeTestCase::test_nochange, test/functorch/test_memory_efficient_fusion.py::NoChangeTestCase::test_rand_like, test/functorch/test_memory_efficient_fusion.py::NoChangeTestCase::test_rand_n, test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_immutable_list_multiple_entries, 
test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_immutable_list_type, test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_kwarg, test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_nested_immutable_list_type, test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_simple, test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_simple_2, test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_simple_multiple_same_ops, test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_two_args, test/functorch/test_memory_efficient_fusion.py::ReduceTestCase::test_two_args_default, test/functorch/test_memory_efficient_fusion.py::RandomOpTestCase::test_random 2025-10-10T02:02:41.4084959Z 2025-10-10T02:02:41.4085174Z Running nn/test_convolution 1/1 ... [2025-10-10 02:02:41.405338] 2025-10-10T02:02:41.4085608Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:02:41.4087274Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_convolution.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:02:41.405858] 2025-10-10T02:08:02.4442095Z 2025-10-10T02:08:02.4443607Z nn/test_convolution 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_convolution_1.1_f49b9ee4a5d2d4e8_.log 2025-10-10T02:08:02.5015136Z Running 610 items in this shard: test/nn/test_convolution.py::TestConvolutionNN::test_Conv1d_module_same_padding, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_1x1, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_OneDNN, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_backward_twice, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_groups_nobias, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_groups_nobias_v2, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_inconsistent_types, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_inconsistent_types_on_GPU_with_cudnn, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_inconsistent_types_on_GPU_without_cudnn, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_missing_argument, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_module_same_padding, test/nn/test_convolution.py::TestConvolutionNN::test_Conv3d_groups_nobias, test/nn/test_convolution.py::TestConvolutionNN::test_Conv3d_groups_wbias, test/nn/test_convolution.py::TestConvolutionNN::test_Conv3d_module_same_padding, test/nn/test_convolution.py::TestConvolutionNN::test_ConvTranspose2d_half_cublas_gemm, test/nn/test_convolution.py::TestConvolutionNN::test_ConvTranspose2d_output_size, test/nn/test_convolution.py::TestConvolutionNN::test_ConvTranspose2d_output_size_downsample_upsample, test/nn/test_convolution.py::TestConvolutionNN::test_ConvTranspose3d_correct_output_size, test/nn/test_convolution.py::TestConvolutionNN::test_conv1d_issue_120547, test/nn/test_convolution.py::TestConvolutionNN::test_conv2d_discontiguous_weight, 
test/nn/test_convolution.py::TestConvolutionNN::test_conv3d_issue_120406, test/nn/test_convolution.py::TestConvolutionNN::test_conv3d_overflow_values, test/nn/test_convolution.py::TestConvolutionNN::test_conv_backcompat, test/nn/test_convolution.py::TestConvolutionNN::test_conv_cudnn_memory_layout_dominance, test/nn/test_convolution.py::TestConvolutionNN::test_conv_invalid_groups, test/nn/test_convolution.py::TestConvolutionNN::test_conv_modules_raise_error_on_incorrect_input_size, test/nn/test_convolution.py::TestConvolutionNN::test_conv_padding_mode, test/nn/test_convolution.py::TestConvolutionNN::test_conv_shapecheck, test/nn/test_convolution.py::TestConvolutionNN::test_conv_tbc, test/nn/test_convolution.py::TestConvolutionNN::test_cudnn_non_contiguous, test/nn/test_convolution.py::TestConvolutionNN::test_cudnn_noncontiguous_weight, test/nn/test_convolution.py::TestConvolutionNN::test_cudnn_not_mutate_stride, test/nn/test_convolution.py::TestConvolutionNN::test_functional_grad_conv, test/nn/test_convolution.py::TestConvolutionNN::test_functional_grad_conv2d, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv1d_input, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv1d_weight, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv2d_input, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv2d_weight, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv3d_input, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv3d_weight, test/nn/test_convolution.py::TestConvolutionNN::test_grouped_conv_cudnn_nhwc_support, test/nn/test_convolution.py::TestConvolutionNN::test_invalid_conv1d, test/nn/test_convolution.py::TestConvolutionNN::test_invalid_conv2d, test/nn/test_convolution.py::TestConvolutionNN::test_invalid_conv3d, test/nn/test_convolution.py::TestConvolutionNN::test_mismatch_shape_conv2d, test/nn/test_convolution.py::TestConvolutionNN::test_nnpack_conv, 
test/nn/test_convolution.py::TestConvolutionNN::test_permute_conv2d_issue_120211, test/nn/test_convolution.py::TestConvolutionNN::test_thnn_conv_strided_padded_dilated, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_backward_depthwise_cuda_complex128, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_backward_depthwise_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_depthwise_naive_groups_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_depthwise_naive_groups_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_depthwise_naive_groups_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_1_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_1_cuda_complex128, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_1_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_1_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_1_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_1_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_2_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_2_cuda_complex128, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_2_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_2_cuda_float16, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_2_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_2_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_complex128, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_large_workspace_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_large_workspace_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_large_workspace_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_large_workspace_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_naive_groups_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_naive_groups_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_naive_groups_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_naive_groups_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_size_1_kernel_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv3d_depthwise_naive_groups_cuda_float16, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv3d_depthwise_naive_groups_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv3d_depthwise_naive_groups_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_ConvTranspose2d_large_output_padding_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_ConvTranspose2d_large_output_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_ConvTranspose2d_size_1_kernel_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_ConvTranspose3d_size_1_kernel_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_contig_wrong_stride_cudnn_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_same_padding_backward_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_same_padding_backward_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_same_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_same_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_valid_padding_backward_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_valid_padding_backward_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_valid_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_valid_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_vs_scipy_mode_same_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_vs_scipy_mode_same_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_vs_scipy_mode_valid_cuda_complex64, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_vs_scipy_mode_valid_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_no_grad_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_same_padding_backward_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_same_padding_backward_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_same_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_same_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_valid_padding_backward_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_valid_padding_backward_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_valid_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_valid_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_vs_scipy_mode_same_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_vs_scipy_mode_same_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_vs_scipy_mode_valid_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_vs_scipy_mode_valid_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_64bit_indexing_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_cudnn_broken_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_same_padding_backward_cuda_complex128, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_same_padding_backward_cuda_float64, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_same_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_same_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_valid_padding_backward_cuda_complex128, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_valid_padding_backward_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_valid_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_valid_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_vs_scipy_mode_same_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_vs_scipy_mode_same_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_vs_scipy_mode_valid_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_vs_scipy_mode_valid_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_convTranspose_empty_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_True_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_False_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_True_strided_True_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_True_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_False_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_True_strided_True_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_True_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_False_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_True_strided_True_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_False_strided_True_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_False_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_True_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_False_strided_True_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_True_strided_True_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_True_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_False_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_True_strided_True_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_False_strided_True_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_False_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_transposed_has_bias_True_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_False_strided_True_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_True_strided_True_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_True_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_False_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_True_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_False_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_True_strided_True_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_False_strided_True_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_False_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_True_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_False_strided_True_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_True_strided_True_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_True_strided_False_contiguous_False_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_False_strided_False_contiguous_True_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_contiguous_for_oneDNN_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_mismatch_memory_format_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_ndhwc_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_ndhwc_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_support_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_support_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_support_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_support_cuda_float64, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_groups_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_no_bias_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_stride_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_strided_with_3D_input_and_weight_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_empty_channel_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_empty_channel_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_ic1_channels_last_for_oneDNN_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_large_batch_1_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_large_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_large_nosplit_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_noncontig_weights_and_bias_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_noncontig_weights_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_thnn_nhwc_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_thnn_nhwc_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_transpose_with_output_size_and_no_batch_dim_ConvTranspose2d_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_transpose_with_output_size_and_no_batch_dim_ConvTranspose3d_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_transposed_large_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_convert_conv2d_weight_memory_format_cuda, 
test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_convert_conv3d_weight_memory_format_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_cudnn_convolution_add_relu_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_cudnn_convolution_add_relu_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_cudnn_convolution_relu_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_cudnn_convolution_relu_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_depthwise_conv_64bit_indexing_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_group_convTranspose_empty_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_group_conv_empty_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_noncontig_conv_grad_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_noncontig_conv_grad_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_noncontig_conv_grad_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_noncontig_conv_grad_cuda_float64 2025-10-10T02:08:02.5257344Z 2025-10-10T02:08:02.5257612Z Running nn/test_pooling 1/1 ... [2025-10-10 02:08:02.448313] 2025-10-10T02:08:02.5257908Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:08:02.5258616Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_pooling.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:08:02.448895] 2025-10-10T02:08:27.4225713Z 2025-10-10T02:08:27.4227428Z nn/test_pooling 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_pooling_1.1_e5e49e248ad573f0_.log 2025-10-10T02:08:27.4321333Z Running 146 items in this shard: test/nn/test_pooling.py::TestAvgPool::test_avg_pool1d_ceil_mode, test/nn/test_pooling.py::TestAvgPool::test_avg_pool2d_ceil_mode, test/nn/test_pooling.py::TestAvgPool::test_avg_pool3d_ceil_mode, test/nn/test_pooling.py::TestAvgPool::test_doubletensor_avg_pool2d, test/nn/test_pooling.py::TestAvgPool::test_doubletensor_avg_pool2d_with_divisor, test/nn/test_pooling.py::TestAvgPool::test_doubletensor_avg_pool3d, test/nn/test_pooling.py::TestAvgPool::test_doubletensor_avg_pool3d_with_divisor, test/nn/test_pooling.py::TestPoolingNN::test_MaxUnpool2d_output_size, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_avg_pooling_nhwc_overflow, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_avg_pooling_overflow, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_pooling_avg_nhwc, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_pooling_avg_nhwc_launch_config_backward, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_pooling_avg_nhwc_launch_config_forward, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_pooling_avg_nhwc_non_contiguous, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_pooling_lower_precision, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_pooling_size_none, test/nn/test_pooling.py::TestPoolingNN::test_adaptive_pooling_size_overflow, test/nn/test_pooling.py::TestPoolingNN::test_max_unpool, test/nn/test_pooling.py::TestPoolingNN::test_max_unpool2d_nhwc_cpu, test/nn/test_pooling.py::TestPoolingNN::test_max_unpool3d_input_check, test/nn/test_pooling.py::TestPoolingNN::test_quantized_max_pool1d_empty_kernel, test/nn/test_pooling.py::TestPoolingNN::test_quantized_max_pool3d, 
test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool1d_indices_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool1d_indices_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool1d_indices_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool1d_indices_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool2d_indices_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool2d_indices_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool2d_indices_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool2d_indices_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool3d_indices_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool3d_indices_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool3d_indices_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool3d_indices_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AdaptiveMaxPool_zero_batch_dim_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AvgPool2d_empty_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_AvgPool3d_backward_after_cat_dim1_device_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_FractionalMaxPool2d_zero_batch_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_FractionalMaxPool2d_zero_out_size_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_FractionalMaxPool2d_zero_samples_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_FractionalMaxPool3d_errors_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_FractionalMaxPool3d_zero_batch_cuda, 
test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_FractionalMaxPool3d_zero_out_size_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_FractionalMaxPool3d_zero_samples_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool1d_indices_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool1d_indices_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool1d_indices_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool1d_indices_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool2d_indices_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool2d_indices_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool2d_indices_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool2d_indices_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool3d_errors_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool3d_indices_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool3d_indices_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool3d_indices_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool3d_indices_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxPool_zero_batch_dim_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case10_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case1_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case2_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case3_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case4_cuda, 
test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case5_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case6_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case7_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case8_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_index_errors_case9_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_invalid_output_size_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_MaxUnpool_zero_batch_dim_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_avg_pool2d_output_size_one_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_avg_pool3d_output_size_one_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_avg_pooling_backward_fails_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_max_pooling_backward_fails_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pool_odd_size_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_empty_output_size_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_empty_output_size_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_empty_output_size_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_empty_output_size_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_max_nhwc_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_max_nhwc_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_no_suppot_input_cuda_int16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_no_suppot_input_cuda_int32, 
test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_no_suppot_input_cuda_int64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_no_suppot_input_cuda_int8, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_no_suppot_input_cuda_uint8, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_zero_batch_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_adaptive_pooling_zero_batch_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_avg_pool2d_nhwc_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_avg_pool2d_nhwc_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_avg_pool2d_nhwc_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_avg_pool2d_reduced_floating_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_avg_pool2d_reduced_floating_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_fractional_max_pool2d_backward_fails_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_fractional_max_pool2d_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_fractional_max_pool3d_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_fractional_max_pool_nan_inf_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_fractional_max_pool_nan_inf_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_fractional_max_pool_nan_inf_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool1d_corner_cases_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool1d_corner_cases_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool1d_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool1d_cuda_float64, 
test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool2d_corner_cases_cuda_int32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool2d_corner_cases_cuda_int64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool2d_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool2d_indices_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool2d_nhwc_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool2d_nhwc_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool2d_nhwc_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool2d_with_indices_backward_fails_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool3d_ndhwc_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool3d_ndhwc_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool3d_ndhwc_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool_bfloat16_half_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool_bfloat16_half_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool_nan_inf_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool_nan_inf_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_pool_nan_inf_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_max_unpool_invalid_indices_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_maxpool3d_non_square_backward_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_maxpool_indices_no_batch_dim_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_maxpool_indices_no_batch_dim_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_maxpool_indices_no_batch_dim_cuda_float32, 
test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_maxpool_indices_no_batch_dim_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool3d_large_size_int64_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool3d_size_one_feature_dim_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool_invalid_size_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool_invalid_size_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool_invalid_size_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool_invalid_size_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool_large_size_cuda_bfloat16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool_large_size_cuda_float16, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool_large_size_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pool_large_size_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_bfloat16_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_large_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_max_nhwc_cuda_float32, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_max_nhwc_cuda_float64, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_shape_kernel_avg_pooling_dims_1_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_shape_kernel_avg_pooling_dims_2_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_shape_kernel_avg_pooling_dims_3_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_shape_kernel_max_pooling_dims_1_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_shape_kernel_max_pooling_dims_2_cuda, test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_shape_kernel_max_pooling_dims_3_cuda, 
test/nn/test_pooling.py::TestPoolingNNDeviceTypeCUDA::test_pooling_zero_stride_cuda 2025-10-10T02:08:27.4400909Z 2025-10-10T02:08:27.4401189Z Running test_autocast 1/1 ... [2025-10-10 02:08:27.423003] 2025-10-10T02:08:27.4401786Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:08:27.4546316Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_autocast.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:08:27.423624] 2025-10-10T02:08:33.2538573Z 2025-10-10T02:08:33.2540170Z test_autocast 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_autocast_1.1_2beceee2055b65c3_.log 2025-10-10T02:08:33.2565159Z Running 20 items in this shard: test/test_autocast.py::TestAutocastCPU::test_autocast_disabled_with_fp32_dtype, test/test_autocast.py::TestAutocastCPU::test_autocast_methods_expect_builtin_promote, test/test_autocast.py::TestAutocastCPU::test_autocast_nn_16, test/test_autocast.py::TestAutocastCPU::test_autocast_nn_fp32, test/test_autocast.py::TestAutocastCPU::test_autocast_rnn, test/test_autocast.py::TestAutocastCPU::test_autocast_torch_16, test/test_autocast.py::TestAutocastCPU::test_autocast_torch_expect_builtin_promote, test/test_autocast.py::TestAutocastCPU::test_autocast_torch_fp32, test/test_autocast.py::TestAutocastCPU::test_autocast_torch_need_autocast_promote, test/test_autocast.py::TestAutocastCPU::test_cpu_autocast_deprecated_warning, test/test_autocast.py::TestAutocastCPU::test_generic_autocast, test/test_autocast.py::TestAutocastGPU::test_autocast_prioritize, test/test_autocast.py::TestAutocastGPU::test_cache_disabled, test/test_autocast.py::TestAutocastGPU::test_cast_cache_is_global, test/test_autocast.py::TestAutocastMPS::test_cast_cache_is_global, test/test_autocast.py::TestAutocastMPS::test_mps_autocast_bfloat16_supported, 
test/test_autocast.py::TestAutocastMPS::test_mps_autocast_error_message, test/test_autocast.py::TestTorchAutocast::test_autocast_fast_dtype, test/test_autocast.py::TestTorchAutocast::test_invalid_device, test/test_autocast.py::TestTorchAutocast::test_non_string_device 2025-10-10T02:08:33.2576305Z 2025-10-10T02:08:33.2576717Z Running test_autograd_fallback 1/1 ... [2025-10-10 02:08:33.254158] 2025-10-10T02:08:33.2577760Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:08:33.2579654Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_autograd_fallback.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:08:33.254743] 2025-10-10T02:08:36.7294549Z 2025-10-10T02:08:36.7296155Z test_autograd_fallback 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_autograd_fallback_1.1_a5a150ac68928a51_.log 2025-10-10T02:08:36.7317962Z Running 28 items in this shard: test/test_autograd_fallback.py::TestAutogradFallback::test_autograd_function_registered_to_cpu_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_autograd_function_registered_to_cpu_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_base_does_not_require_grad_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_base_does_not_require_grad_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_composite_registered_to_cpu_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_composite_registered_to_cpu_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_cpu_return_self_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_cpu_return_self_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_inplace_autograd_function_registered_to_cpu_mode_nothing, 
test/test_autograd_fallback.py::TestAutogradFallback::test_inplace_autograd_function_registered_to_cpu_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_inplace_on_tensor_that_does_not_require_grad_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_inplace_on_tensor_that_does_not_require_grad_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_no_autograd_kernel_inplace_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_no_autograd_kernel_inplace_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_no_autograd_kernel_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_no_autograd_kernel_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_no_grad_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_no_grad_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_post_autograd_returns_leaf_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_post_autograd_returns_leaf_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_post_autograd_returns_mix_of_requires_grad_tensors_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_post_autograd_returns_mix_of_requires_grad_tensors_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_supports_tensor_lists_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_supports_tensor_lists_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_undefined_grads_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_undefined_grads_mode_warn, test/test_autograd_fallback.py::TestAutogradFallback::test_undefined_inputs_outputs_mode_nothing, test/test_autograd_fallback.py::TestAutogradFallback::test_undefined_inputs_outputs_mode_warn 2025-10-10T02:08:36.7339374Z 2025-10-10T02:08:36.7339798Z Running test_autoload_disable 1/1 ... 
[2025-10-10 02:08:36.729681] 2025-10-10T02:08:37.1591243Z Processing /var/lib/jenkins/pytorch/test/cpp_extensions 2025-10-10T02:08:40.5572317Z Preparing metadata (setup.py) ... done 2025-10-10T02:08:40.5605940Z Building wheels for collected packages: torch_test_cpp_extension 2025-10-10T02:08:40.5623984Z DEPRECATION: Building 'torch_test_cpp_extension' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'torch_test_cpp_extension'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T02:08:48.9596545Z Building wheel for torch_test_cpp_extension (setup.py) ... done 2025-10-10T02:08:48.9761353Z Created wheel for torch_test_cpp_extension: filename=torch_test_cpp_extension-0.0.0-cp310-cp310-linux_x86_64.whl size=13001377 sha256=a614debd8a5a03026710bf52e3a828a958fb7303642fe4c9d748885ea06c474c 2025-10-10T02:08:48.9763698Z Stored in directory: /tmp/pip-ephem-wheel-cache-_1s2iz2o/wheels/a9/2e/d7/a9e103243c0b754e2324c4ee6ddd055c388a2eefc520cfc979 2025-10-10T02:08:48.9783671Z Successfully built torch_test_cpp_extension 2025-10-10T02:08:49.2800961Z Installing collected packages: torch_test_cpp_extension 2025-10-10T02:08:49.4814222Z Successfully installed torch_test_cpp_extension-0.0.0 2025-10-10T02:08:51.9087382Z 2025-10-10T02:08:51.9088028Z Running tests... 2025-10-10T02:08:51.9088732Z ---------------------------------------------------------------------- 2025-10-10T02:08:52.1737923Z .
2025-10-10T02:08:52.1738546Z ---------------------------------------------------------------------- 2025-10-10T02:08:52.1739290Z Ran 1 test in 0.265s 2025-10-10T02:08:52.1739587Z 2025-10-10T02:08:52.1739731Z OK 2025-10-10T02:08:52.1740046Z 2025-10-10T02:08:52.1740263Z Generating XML reports... 2025-10-10T02:08:52.6705014Z Running test_cpp_api_parity 1/1 ... [2025-10-10 02:08:52.669781] 2025-10-10T02:08:52.6705891Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:08:52.6711777Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cpp_api_parity.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:08:52.670472] 2025-10-10T02:09:57.2384426Z 2025-10-10T02:09:57.2385769Z test_cpp_api_parity 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cpp_api_parity_1.1_b0ac1b801289c490_.log 2025-10-10T02:09:57.2679202Z Running 488 items in this shard: test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCELoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCELoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCELoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCELoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCELoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCELoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCEWithLogitsLoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCEWithLogitsLoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCEWithLogitsLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCEWithLogitsLoss_no_batch_dim_none_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCEWithLogitsLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_BCEWithLogitsLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_circular_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_circular_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_dilated_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_groups, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_groups_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad1, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad1_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad1size1, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad1size1_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad2size1, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad2size1_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad_same, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad_same2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad_same2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad_same_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad_same_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad_same_dilated_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad_valid, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_pad_valid_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_reflect_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_reflect_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_replicate_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_replicate_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_stride, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_stride_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_zero_batch, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_zero_batch_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_zeros_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv1d_zeros_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_circular_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_circular_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_dilated_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_padded, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_padded_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_strided, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_strided_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_with_multiplier, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_depthwise_with_multiplier_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_dilated_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_groups, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_groups_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_groups_thnn, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_groups_thnn_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_no_bias, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_no_bias_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_pad_same, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_pad_same_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_pad_same_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_pad_same_dilated_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_pad_valid, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_pad_valid_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_padding, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_padding_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_reflect_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_reflect_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_replicate_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_replicate_stride2_pad2_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_strided, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_strided_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_zero_batch, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_zero_batch_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_zeros_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv2d_zeros_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_1x1x1_no_bias, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_1x1x1_no_bias_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_circular_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_circular_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_dilated_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_dilated_strided, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_dilated_strided_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_groups, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_groups_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_no_bias, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_no_bias_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_pad_same, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_pad_same_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_pad_same_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_pad_same_dilated_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_pad_valid, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_pad_valid_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_replicate_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_replicate_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_stride, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_stride_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_stride_padding, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_stride_padding_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_zero_batch, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_zero_batch_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_zeros_stride2_pad2, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Conv3d_zeros_stride2_pad2_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose1d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose1d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose1d_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose1d_dilated_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose1d_groups, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose1d_groups_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose1d_no_bias, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose1d_no_bias_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose2d_dilated, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose2d_dilated_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose2d_groups, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose2d_groups_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose2d_no_bias, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose2d_no_bias_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose3d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose3d_dilated, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ConvTranspose3d_dilated_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_CosineEmbeddingLoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_CosineEmbeddingLoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_CosineEmbeddingLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_CosineEmbeddingLoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_CosineEmbeddingLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_CosineEmbeddingLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_CrossMapLRN2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_CrossMapLRN2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Embedding, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_discontiguous, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_discontiguous_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_max, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_max_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_max_padding_idx, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_max_padding_idx_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_mean_padding_idx, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_mean_padding_idx_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_sparse, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_sparse_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_sum_padding_idx, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_EmbeddingBag_sum_padding_idx_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Embedding_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Embedding_discontiguous, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Embedding_discontiguous_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Embedding_sparse, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Embedding_sparse_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Flatten, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Flatten_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Flatten_no_batch_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Flatten_no_batch_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Fold, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Fold_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Fold_int_input, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Fold_int_input_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Fold_no_batch_dim_input, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Fold_no_batch_dim_input_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Fold_no_batch_dim_int_input, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Fold_no_batch_dim_int_input_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_HingeEmbeddingLoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_HingeEmbeddingLoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_HingeEmbeddingLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_HingeEmbeddingLoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_HingeEmbeddingLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_HingeEmbeddingLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_LayerNorm_3d_no_affine_large_feature, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_LayerNorm_3d_no_affine_large_feature_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Linear, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Linear_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Linear_no_batch_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Linear_no_batch_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Linear_no_bias, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Linear_no_bias_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MarginRankingLoss_no_batch_dim_mean, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MarginRankingLoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MarginRankingLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MarginRankingLoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MarginRankingLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MarginRankingLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelMarginLoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelMarginLoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelMarginLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelMarginLoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelMarginLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelMarginLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelSoftMarginLoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelSoftMarginLoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelSoftMarginLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelSoftMarginLoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelSoftMarginLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_MultiLabelSoftMarginLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_NLLLoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_NLLLoss_no_batch_dim_mean_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_NLLLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_NLLLoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_NLLLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_NLLLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_broadcast_lhs, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_broadcast_lhs_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_broadcast_rhs, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_broadcast_rhs_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_no_batch_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_no_batch_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_with_non_default_args, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PairwiseDistance_with_non_default_args_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PixelShuffle, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PixelShuffle_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PixelUnshuffle, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_PixelUnshuffle_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_RReLU, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_RReLU_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_RReLU_with_up_down, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_RReLU_with_up_down_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_RReLU_with_up_down_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_RReLU_with_up_down_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ReplicationPad3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ReplicationPad3d_complex, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ReplicationPad3d_complex_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ReplicationPad3d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ReplicationPad3d_no_batch_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_ReplicationPad3d_no_batch_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SampleModule_has_parity, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SampleModule_has_parity_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SampleModule_no_parity, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SampleModule_no_parity_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SoftMarginLoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SoftMarginLoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SoftMarginLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SoftMarginLoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SoftMarginLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_SoftMarginLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TransformerDecoderLayer_gelu_activation, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TransformerDecoderLayer_gelu_activation_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TransformerDecoderLayer_relu_activation, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TransformerDecoderLayer_relu_activation_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TransformerEncoderLayer_gelu_activation, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TransformerEncoderLayer_gelu_activation_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TransformerEncoderLayer_relu_activation, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TransformerEncoderLayer_relu_activation_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Transformer_multilayer_coder, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Transformer_multilayer_coder_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TripletMarginLoss_no_batch_dim_mean, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TripletMarginLoss_no_batch_dim_mean_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TripletMarginLoss_no_batch_dim_none, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TripletMarginLoss_no_batch_dim_none_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TripletMarginLoss_no_batch_dim_sum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_TripletMarginLoss_no_batch_dim_sum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Unflatten_no_batch_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Unflatten_no_batch_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Unfold, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Unfold_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Unfold_int_input, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_Unfold_int_input_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCELoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCELoss_no_reduce_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCELoss_no_reduce_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCELoss_no_reduce_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCELoss_weights_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCELoss_weights_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCELoss_weights_no_reduce_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCELoss_weights_no_reduce_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCEWithLogitsLoss_legacy_enum, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCEWithLogitsLoss_legacy_enum_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCEWithLogitsLoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCEWithLogitsLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCEWithLogitsLoss_no_reduce_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_BCEWithLogitsLoss_no_reduce_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_HingeEmbeddingLoss_margin_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_HingeEmbeddingLoss_margin_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_HingeEmbeddingLoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_HingeEmbeddingLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_HuberLoss_delta, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_HuberLoss_delta_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_no_reduce, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_no_reduce_log_target, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_no_reduce_log_target_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_no_reduce_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_no_reduce_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_no_reduce_scalar_log_target, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_no_reduce_scalar_log_target_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_with_log_target_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_with_log_target_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_with_target_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_KLDivLoss_with_target_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_L1Loss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_L1Loss_no_reduce_complex, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_L1Loss_no_reduce_complex_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_L1Loss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_L1Loss_no_reduce_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_L1Loss_no_reduce_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MSELoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MSELoss_no_reduce_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MSELoss_no_reduce_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MSELoss_no_reduce_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelMarginLoss_0d_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelMarginLoss_0d_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelMarginLoss_1d_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelMarginLoss_1d_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelMarginLoss_index_neg, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelMarginLoss_index_neg_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelMarginLoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelMarginLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelSoftMarginLoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelSoftMarginLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelSoftMarginLoss_weights_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiLabelSoftMarginLoss_weights_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_1d_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_1d_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_margin_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_margin_no_reduce_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_p_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_p_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_weights_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_MultiMarginLoss_weights_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss2d_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss2d_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss2d_no_reduce_ignore_index, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss2d_no_reduce_ignore_index_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss2d_no_reduce_weights, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss2d_no_reduce_weights_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLossNd_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLossNd_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLossNd_no_reduce_ignore_index, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLossNd_no_reduce_ignore_index_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLossNd_no_reduce_weights, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLossNd_no_reduce_weights_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_ignore_index, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_ignore_index_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_weights, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_weights_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_weights_ignore_index, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_weights_ignore_index_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_weights_ignore_index_neg, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_NLLLoss_no_reduce_weights_ignore_index_neg_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_PoissonNLLLoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_PoissonNLLLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SmoothL1Loss_beta, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SmoothL1Loss_beta_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SmoothL1Loss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SmoothL1Loss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SmoothL1Loss_no_reduce_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SmoothL1Loss_no_reduce_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SmoothL1Loss_zero_beta, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SmoothL1Loss_zero_beta_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SoftMarginLoss_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_SoftMarginLoss_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_2d_zero_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_2d_zero_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_scale_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_scale_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_scale_tuple_shared_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_scale_tuple_shared_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_scale_tuple_skewed_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_scale_tuple_skewed_2d_align_corners, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_scale_tuple_skewed_2d_align_corners_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_scale_tuple_skewed_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_tuple_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_tuple_2d_align_corners, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_tuple_2d_align_corners_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bicubic_tuple_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_2d_zero_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_2d_zero_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_scale_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_scale_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_scale_tuple_shared_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_scale_tuple_shared_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_scale_tuple_skewed_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_scale_tuple_skewed_2d_align_corners, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_scale_tuple_skewed_2d_align_corners_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_scale_tuple_skewed_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_tuple_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_tuple_2d_align_corners, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_tuple_2d_align_corners_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_bilinear_tuple_2d_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_1d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_1d_align_corners, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_1d_align_corners_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_1d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_1d_zero_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_1d_zero_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_scale_1d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_scale_1d_align_corners, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_scale_1d_align_corners_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_scale_1d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_tuple_1d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_linear_tuple_1d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_1d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_1d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_1d_zero_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_1d_zero_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_2d_launch_configs, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_2d_launch_configs_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_2d_zero_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_2d_zero_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_3d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_3d_zero_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_3d_zero_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_scale_1d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_scale_1d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_scale_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_scale_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_scale_3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_scale_3d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_tuple_1d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_tuple_1d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_tuple_2d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_tuple_2d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_tuple_3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_nearest_tuple_3d_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_3d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_3d_zero_dim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_3d_zero_dim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_scale_3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_scale_3d_align_corners, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_scale_3d_align_corners_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_scale_3d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_tuple_3d, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_tuple_3d_align_corners, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_tuple_3d_align_corners_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_interpolate_trilinear_tuple_3d_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_dim0, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_dim0_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_dim3, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_dim3_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_lastdim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_lastdim_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_scalar, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_spatial, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_spatial_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_spatial_special, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_log_softmax_spatial_special_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_multimarginloss_1d_input_0d_target_no_reduce, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_multimarginloss_1d_input_0d_target_no_reduce_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_sample_functional_has_parity, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_sample_functional_has_parity_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_sample_functional_no_parity, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_sample_functional_no_parity_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_functional_dim0, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_functional_dim0_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_functional_dim3, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_functional_dim3_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_functional_scalar, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_functional_scalar_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_lastdim, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_lastdim_cuda, 
test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_lastdim_dtype, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_lastdim_dtype_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_spatial, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_spatial_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_spatial_dtype, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_spatial_dtype_cuda, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_spatial_special, test/test_cpp_api_parity.py::TestCppApiParity::test_torch_nn_functional_softmax_spatial_special_cuda 2025-10-10T02:09:57.2936644Z 2025-10-10T02:09:57.2936826Z Running test_cpp_extensions_aot_ninja 1/1 ... [2025-10-10 02:09:57.240171] 2025-10-10T02:09:57.7313982Z Processing /var/lib/jenkins/pytorch/test/cpp_extensions 2025-10-10T02:10:01.2206350Z Preparing metadata (setup.py) ... [?25l- \ done 2025-10-10T02:10:01.2239661Z [?25hBuilding wheels for collected packages: torch_test_cpp_extension 2025-10-10T02:10:01.2257241Z  DEPRECATION: Building 'torch_test_cpp_extension' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'torch_test_cpp_extension'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T02:12:55.0027546Z  Building wheel for torch_test_cpp_extension (setup.py) ... 
done 2025-10-10T02:12:55.0192684Z Created wheel for torch_test_cpp_extension: filename=torch_test_cpp_extension-0.0.0-cp310-cp310-linux_x86_64.whl size=13001612 sha256=77484d9773467afd7d06b6e7c19f06a38e8728a311a7ab4bc7a59f62e9b59763 2025-10-10T02:12:55.0195185Z Stored in directory: /tmp/pip-ephem-wheel-cache-zhdn35ni/wheels/a9/2e/d7/a9e103243c0b754e2324c4ee6ddd055c388a2eefc520cfc979 2025-10-10T02:12:55.0218833Z Successfully built torch_test_cpp_extension 2025-10-10T02:12:55.3265215Z Installing collected packages: torch_test_cpp_extension 2025-10-10T02:12:55.5255295Z Successfully installed torch_test_cpp_extension-0.0.0 2025-10-10T02:12:55.9670211Z Processing /var/lib/jenkins/pytorch/test/cpp_extensions/no_python_abi_suffix_test 2025-10-10T02:12:57.6465012Z Preparing metadata (setup.py) ... done 2025-10-10T02:12:57.6502074Z Building wheels for collected packages: no_python_abi_suffix_test 2025-10-10T02:12:57.6526140Z  DEPRECATION: Building 'no_python_abi_suffix_test' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'no_python_abi_suffix_test'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T02:12:59.8166417Z  Building wheel for no_python_abi_suffix_test (setup.py) ... 
done 2025-10-10T02:12:59.8173014Z Created wheel for no_python_abi_suffix_test: filename=no_python_abi_suffix_test-0.0.0-cp310-cp310-linux_x86_64.whl size=2944 sha256=6abc420820bb285528086c7d9859fb79a7b1c4f5f365f5b3085ce4b2b257f6dc 2025-10-10T02:12:59.8175307Z Stored in directory: /tmp/pip-ephem-wheel-cache-dytlp797/wheels/01/96/31/d3c48c51cc163420d8b3b57e95a07fda055add3ed0ea48001b 2025-10-10T02:12:59.8217562Z Successfully built no_python_abi_suffix_test 2025-10-10T02:13:00.1423397Z Installing collected packages: no_python_abi_suffix_test 2025-10-10T02:13:00.1467958Z Successfully installed no_python_abi_suffix_test-0.0.0 2025-10-10T02:13:00.2634250Z * Getting build dependencies for wheel... 2025-10-10T02:13:02.0370251Z /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu -> /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu [skipped, no changes] 2025-10-10T02:13:02.0373631Z Successfully preprocessed all matching files. 2025-10-10T02:13:02.0374403Z Total number of unsupported CUDA function calls: 0 2025-10-10T02:13:02.0374861Z 2025-10-10T02:13:02.0374869Z 2025-10-10T02:13:02.0375089Z Total number of replaced kernel launches: 0 2025-10-10T02:13:02.0698689Z running egg_info 2025-10-10T02:13:02.0762368Z creating python_agnostic.egg-info 2025-10-10T02:13:02.0763067Z writing python_agnostic.egg-info/PKG-INFO 2025-10-10T02:13:02.0766776Z writing dependency_links to python_agnostic.egg-info/dependency_links.txt 2025-10-10T02:13:02.0768380Z writing top-level names to python_agnostic.egg-info/top_level.txt 2025-10-10T02:13:02.0769355Z writing manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:02.1666552Z reading manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:02.1683491Z writing manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:02.6326985Z * Building wheel... 
2025-10-10T02:13:04.4004688Z /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu -> /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu [skipped, no changes] 2025-10-10T02:13:04.4007211Z Successfully preprocessed all matching files. 2025-10-10T02:13:04.4007983Z Total number of unsupported CUDA function calls: 0 2025-10-10T02:13:04.4008449Z 2025-10-10T02:13:04.4008462Z 2025-10-10T02:13:04.4008687Z Total number of replaced kernel launches: 0 2025-10-10T02:13:04.4236434Z running bdist_wheel 2025-10-10T02:13:04.5061551Z running build 2025-10-10T02:13:04.5062044Z running build_ext 2025-10-10T02:13:04.5094191Z building 'python_agnostic._C' extension 2025-10-10T02:13:04.5099992Z creating /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/build/temp.linux-x86_64-cpython-310/python_agnostic/csrc 2025-10-10T02:13:24.0557710Z [1/1] /opt/rocm/bin/hipcc -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THH -I/opt/rocm/include -I/opt/conda/envs/py_3.10/include/python3.10 -c -c /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu -o /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/build/temp.linux-x86_64-cpython-310/python_agnostic/csrc/ultra_norm.o -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DHIPBLAS_V2 -fPIC -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -DHIP_ENABLE_WARP_SYNC_BUILTINS=1 -DTORCH_API_INCLUDE_EXTENSION_H -DPy_LIMITED_API=0x030A0000 -DTORCH_EXTENSION_NAME=_C --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx950 -fno-gpu-rdc -std=c++17 2025-10-10T02:13:24.0642533Z creating build/lib.linux-x86_64-cpython-310/python_agnostic 
2025-10-10T02:13:24.0652012Z g++ -pthread -B /opt/conda/envs/py_3.10/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /opt/conda/envs/py_3.10/include -fPIC -O2 -isystem /opt/conda/envs/py_3.10/include -pthread -B /opt/conda/envs/py_3.10/compiler_compat -shared /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/build/temp.linux-x86_64-cpython-310/python_agnostic/csrc/ultra_norm.o -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -lc10 -ltorch -ltorch_cpu -lamdhip64 -lc10_hip -ltorch_hip -o build/lib.linux-x86_64-cpython-310/python_agnostic/_C.so 2025-10-10T02:13:24.4432372Z installing to build/bdist.linux-x86_64/wheel 2025-10-10T02:13:24.4433053Z running install 2025-10-10T02:13:24.4520993Z running install_lib 2025-10-10T02:13:24.4650289Z creating build/bdist.linux-x86_64/wheel 2025-10-10T02:13:24.4651874Z creating build/bdist.linux-x86_64/wheel/python_agnostic 2025-10-10T02:13:24.4653907Z copying build/lib.linux-x86_64-cpython-310/python_agnostic/_C.so -> build/bdist.linux-x86_64/wheel/./python_agnostic 2025-10-10T02:13:24.4658113Z running install_egg_info 2025-10-10T02:13:24.4715556Z running egg_info 2025-10-10T02:13:24.4771227Z writing python_agnostic.egg-info/PKG-INFO 2025-10-10T02:13:24.4772717Z writing dependency_links to python_agnostic.egg-info/dependency_links.txt 2025-10-10T02:13:24.4777162Z writing top-level names to python_agnostic.egg-info/top_level.txt 2025-10-10T02:13:24.4836954Z reading manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:24.4844628Z writing manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:24.4845892Z Copying python_agnostic.egg-info to build/bdist.linux-x86_64/wheel/./python_agnostic-0.0-py3.10.egg-info 2025-10-10T02:13:24.4852759Z running install_scripts 2025-10-10T02:13:24.4951972Z creating build/bdist.linux-x86_64/wheel/python_agnostic-0.0.dist-info/WHEEL 
2025-10-10T02:13:24.4954035Z creating '/var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/dist/.tmp-qm_qg19u/python_agnostic-0.0-cp39-abi3-linux_x86_64.whl' and adding 'build/bdist.linux-x86_64/wheel' to it 2025-10-10T02:13:24.4978412Z adding 'python_agnostic/_C.so' 2025-10-10T02:13:24.4980682Z adding 'python_agnostic-0.0.dist-info/METADATA' 2025-10-10T02:13:24.4981384Z adding 'python_agnostic-0.0.dist-info/WHEEL' 2025-10-10T02:13:24.4987056Z adding 'python_agnostic-0.0.dist-info/top_level.txt' 2025-10-10T02:13:24.4987884Z adding 'python_agnostic-0.0.dist-info/RECORD' 2025-10-10T02:13:24.4988534Z removing build/bdist.linux-x86_64/wheel 2025-10-10T02:13:24.8614423Z Successfully built python_agnostic-0.0-cp39-abi3-linux_x86_64.whl 2025-10-10T02:13:25.2738473Z Processing /var/lib/jenkins/pytorch/test/cpp_extensions/libtorch_agnostic_extension 2025-10-10T02:13:27.5751209Z Preparing metadata (setup.py) ... [?25l- \ done 2025-10-10T02:13:27.5783603Z [?25hRequirement already satisfied: torch in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from libtorch_agnostic==0.0) (2.10.0a0+git344e636) 2025-10-10T02:13:27.5822976Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (3.18.0) 2025-10-10T02:13:27.5826779Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (4.12.2) 2025-10-10T02:13:27.5832077Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (1.13.3) 2025-10-10T02:13:27.5836036Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (2.8.8) 2025-10-10T02:13:27.5841001Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) 
(3.1.6) 2025-10-10T02:13:27.5842990Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (2025.9.0) 2025-10-10T02:13:27.6376540Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch->libtorch_agnostic==0.0) (1.3.0) 2025-10-10T02:13:27.6420918Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch->libtorch_agnostic==0.0) (3.0.3) 2025-10-10T02:13:27.6428740Z Building wheels for collected packages: libtorch_agnostic 2025-10-10T02:13:27.6439237Z  DEPRECATION: Building 'libtorch_agnostic' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'libtorch_agnostic'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T02:13:31.8499567Z  Building wheel for libtorch_agnostic (setup.py) ... 
[?25l- \ | / done 2025-10-10T02:13:31.8505247Z [?25h Created wheel for libtorch_agnostic: filename=libtorch_agnostic-0.0-cp39-abi3-linux_x86_64.whl size=34680 sha256=4e13bc864fdea4bb7a54ce9bbd32fe7338e2df348ab9849e282f9fd2b87100ef 2025-10-10T02:13:31.8507405Z Stored in directory: /tmp/pip-ephem-wheel-cache-j0ir5ekx/wheels/0d/08/74/4ba0a92b390e7b767925227eeb64822a849cf3565e6a5de83a 2025-10-10T02:13:31.8540683Z Successfully built libtorch_agnostic 2025-10-10T02:13:32.1294702Z Installing collected packages: libtorch_agnostic 2025-10-10T02:13:32.1357031Z Successfully installed libtorch_agnostic-0.0 2025-10-10T02:13:32.1787081Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:13:32.1793655Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cpp_extensions_aot_ninja.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:13:32.178713] 2025-10-10T02:13:35.8682164Z 2025-10-10T02:13:35.8683599Z test_cpp_extensions_aot_ninja 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cpp_extensions_aot_ninja_1.1_913e37a3b6715e4f_.log 2025-10-10T02:13:35.8696150Z Running 21 items in this shard: test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_backward, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_cublas_extension, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_cuda_dlink_libs, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_cuda_extension, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_cusolver_extension, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_extension_function, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_extension_module, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_mps_extension, 
test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_no_python_abi_suffix_sets_the_correct_library_name, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_optional, test/test_cpp_extensions_aot_ninja.py::TestCppExtensionAOT::test_sycl_extension, test/test_cpp_extensions_aot_ninja.py::TestPybindTypeCasters::test_pybind_return_types, test/test_cpp_extensions_aot_ninja.py::TestMAIATensor::test_add, test/test_cpp_extensions_aot_ninja.py::TestMAIATensor::test_autocast_apis_for_maia_device, test/test_cpp_extensions_aot_ninja.py::TestMAIATensor::test_conv_backend_override, test/test_cpp_extensions_aot_ninja.py::TestMAIATensor::test_matmul_autocast_default_precision, test/test_cpp_extensions_aot_ninja.py::TestMAIATensor::test_matmul_autocast_float16_precision, test/test_cpp_extensions_aot_ninja.py::TestMAIATensor::test_unregistered, test/test_cpp_extensions_aot_ninja.py::TestMAIATensor::test_zeros, test/test_cpp_extensions_aot_ninja.py::TestRNGExtension::test_rng, test/test_cpp_extensions_aot_ninja.py::TestTorchLibrary::test_torch_library 2025-10-10T02:13:35.8707889Z 2025-10-10T02:13:35.8708313Z Running test_cpp_extensions_aot_no_ninja 1/1 ... [2025-10-10 02:13:35.868650] 2025-10-10T02:13:36.3113490Z Processing /var/lib/jenkins/pytorch/test/cpp_extensions 2025-10-10T02:13:39.6872633Z Preparing metadata (setup.py) ... [?25l- done 2025-10-10T02:13:39.6904983Z [?25hBuilding wheels for collected packages: torch_test_cpp_extension 2025-10-10T02:13:39.6923173Z  DEPRECATION: Building 'torch_test_cpp_extension' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'torch_test_cpp_extension'. 
Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T02:13:48.0614879Z  Building wheel for torch_test_cpp_extension (setup.py) ... [?25l- \ | / - \ | / - \ | / - \ | / - \ done 2025-10-10T02:13:48.0806473Z [?25h Created wheel for torch_test_cpp_extension: filename=torch_test_cpp_extension-0.0.0-cp310-cp310-linux_x86_64.whl size=13001377 sha256=fc23bab78bc1288439034023eeabc82d9d3756aeab289a1ce9de390178a8941d 2025-10-10T02:13:48.0809117Z Stored in directory: /tmp/pip-ephem-wheel-cache-tm_4fqaf/wheels/a9/2e/d7/a9e103243c0b754e2324c4ee6ddd055c388a2eefc520cfc979 2025-10-10T02:13:48.0829959Z Successfully built torch_test_cpp_extension 2025-10-10T02:13:48.3931506Z Installing collected packages: torch_test_cpp_extension 2025-10-10T02:13:48.5886198Z Successfully installed torch_test_cpp_extension-0.0.0 2025-10-10T02:13:49.0220901Z Processing /var/lib/jenkins/pytorch/test/cpp_extensions/no_python_abi_suffix_test 2025-10-10T02:13:50.6822286Z Preparing metadata (setup.py) ... [?25l- done 2025-10-10T02:13:50.6854907Z [?25hBuilding wheels for collected packages: no_python_abi_suffix_test 2025-10-10T02:13:50.6870694Z  DEPRECATION: Building 'no_python_abi_suffix_test' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'no_python_abi_suffix_test'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T02:13:52.6689905Z  Building wheel for no_python_abi_suffix_test (setup.py) ... 
done 2025-10-10T02:13:52.6696839Z Created wheel for no_python_abi_suffix_test: filename=no_python_abi_suffix_test-0.0.0-cp310-cp310-linux_x86_64.whl size=2944 sha256=06289f83a2b14581617078a48f73784715182a2738b3da3a17cb8e3c14cbf586 2025-10-10T02:13:52.6699616Z Stored in directory: /tmp/pip-ephem-wheel-cache-8kylkaxv/wheels/01/96/31/d3c48c51cc163420d8b3b57e95a07fda055add3ed0ea48001b 2025-10-10T02:13:52.6719710Z Successfully built no_python_abi_suffix_test 2025-10-10T02:13:52.9921121Z Installing collected packages: no_python_abi_suffix_test 2025-10-10T02:13:52.9964034Z Successfully installed no_python_abi_suffix_test-0.0.0 2025-10-10T02:13:53.1088081Z * Getting build dependencies for wheel... 2025-10-10T02:13:54.8815597Z /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu -> /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu [skipped, no changes] 2025-10-10T02:13:54.8818201Z Successfully preprocessed all matching files. 2025-10-10T02:13:54.8818957Z Total number of unsupported CUDA function calls: 0 2025-10-10T02:13:54.8819426Z 2025-10-10T02:13:54.8819435Z 2025-10-10T02:13:54.8819670Z Total number of replaced kernel launches: 0 2025-10-10T02:13:54.9157680Z running egg_info 2025-10-10T02:13:54.9222526Z writing python_agnostic.egg-info/PKG-INFO 2025-10-10T02:13:54.9227060Z writing dependency_links to python_agnostic.egg-info/dependency_links.txt 2025-10-10T02:13:54.9228145Z writing top-level names to python_agnostic.egg-info/top_level.txt 2025-10-10T02:13:55.0101704Z reading manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:55.0121261Z writing manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:55.4802655Z * Building wheel... 
2025-10-10T02:13:57.2419964Z /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu -> /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/python_agnostic/csrc/ultra_norm.cu [skipped, no changes] 2025-10-10T02:13:57.2422483Z Successfully preprocessed all matching files. 2025-10-10T02:13:57.2423238Z Total number of unsupported CUDA function calls: 0 2025-10-10T02:13:57.2423688Z 2025-10-10T02:13:57.2423696Z 2025-10-10T02:13:57.2423935Z Total number of replaced kernel launches: 0 2025-10-10T02:13:57.2657106Z running bdist_wheel 2025-10-10T02:13:57.3501739Z running build 2025-10-10T02:13:57.3502232Z running build_ext 2025-10-10T02:13:57.3534865Z building 'python_agnostic._C' extension 2025-10-10T02:13:57.4680957Z ninja: no work to do. 2025-10-10T02:13:57.4731591Z g++ -pthread -B /opt/conda/envs/py_3.10/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /opt/conda/envs/py_3.10/include -fPIC -O2 -isystem /opt/conda/envs/py_3.10/include -pthread -B /opt/conda/envs/py_3.10/compiler_compat -shared /var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/build/temp.linux-x86_64-cpython-310/python_agnostic/csrc/ultra_norm.o -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -lc10 -ltorch -ltorch_cpu -lamdhip64 -lc10_hip -ltorch_hip -o build/lib.linux-x86_64-cpython-310/python_agnostic/_C.so 2025-10-10T02:13:57.8260519Z installing to build/bdist.linux-x86_64/wheel 2025-10-10T02:13:57.8261827Z running install 2025-10-10T02:13:57.8305415Z running install_lib 2025-10-10T02:13:57.8381748Z creating build/bdist.linux-x86_64/wheel 2025-10-10T02:13:57.8382623Z creating build/bdist.linux-x86_64/wheel/python_agnostic 2025-10-10T02:13:57.8383842Z copying build/lib.linux-x86_64-cpython-310/python_agnostic/_C.so -> build/bdist.linux-x86_64/wheel/./python_agnostic 2025-10-10T02:13:57.8385108Z running 
install_egg_info 2025-10-10T02:13:57.8458329Z running egg_info 2025-10-10T02:13:57.8523401Z writing python_agnostic.egg-info/PKG-INFO 2025-10-10T02:13:57.8524361Z writing dependency_links to python_agnostic.egg-info/dependency_links.txt 2025-10-10T02:13:57.8527293Z writing top-level names to python_agnostic.egg-info/top_level.txt 2025-10-10T02:13:57.8596170Z reading manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:57.8606301Z writing manifest file 'python_agnostic.egg-info/SOURCES.txt' 2025-10-10T02:13:57.8607548Z Copying python_agnostic.egg-info to build/bdist.linux-x86_64/wheel/./python_agnostic-0.0-py3.10.egg-info 2025-10-10T02:13:57.8615252Z running install_scripts 2025-10-10T02:13:57.8725597Z creating build/bdist.linux-x86_64/wheel/python_agnostic-0.0.dist-info/WHEEL 2025-10-10T02:13:57.8727733Z creating '/var/lib/jenkins/pytorch/test/cpp_extensions/python_agnostic_extension/dist/.tmp-f4k2p8ll/python_agnostic-0.0-cp39-abi3-linux_x86_64.whl' and adding 'build/bdist.linux-x86_64/wheel' to it 2025-10-10T02:13:57.8747500Z adding 'python_agnostic/_C.so' 2025-10-10T02:13:57.8749879Z adding 'python_agnostic-0.0.dist-info/METADATA' 2025-10-10T02:13:57.8750632Z adding 'python_agnostic-0.0.dist-info/WHEEL' 2025-10-10T02:13:57.8751335Z adding 'python_agnostic-0.0.dist-info/top_level.txt' 2025-10-10T02:13:57.8752119Z adding 'python_agnostic-0.0.dist-info/RECORD' 2025-10-10T02:13:57.8752839Z removing build/bdist.linux-x86_64/wheel 2025-10-10T02:13:58.2649690Z Successfully built python_agnostic-0.0-cp39-abi3-linux_x86_64.whl 2025-10-10T02:13:58.6729875Z Processing /var/lib/jenkins/pytorch/test/cpp_extensions/libtorch_agnostic_extension 2025-10-10T02:14:00.9848887Z Preparing metadata (setup.py) ... 
done 2025-10-10T02:14:00.9887068Z Requirement already satisfied: torch in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from libtorch_agnostic==0.0) (2.10.0a0+git344e636) 2025-10-10T02:14:00.9932948Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (3.18.0) 2025-10-10T02:14:00.9936929Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (4.12.2) 2025-10-10T02:14:00.9941292Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (1.13.3) 2025-10-10T02:14:00.9946369Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (2.8.8) 2025-10-10T02:14:00.9949776Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (3.1.6) 2025-10-10T02:14:00.9953281Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->libtorch_agnostic==0.0) (2025.9.0) 2025-10-10T02:14:01.0491389Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch->libtorch_agnostic==0.0) (1.3.0) 2025-10-10T02:14:01.0538490Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch->libtorch_agnostic==0.0) (3.0.3) 2025-10-10T02:14:01.0546410Z Building wheels for collected packages: libtorch_agnostic 2025-10-10T02:14:01.0557121Z DEPRECATION: Building 'libtorch_agnostic' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. 
A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'libtorch_agnostic'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T02:14:03.8660203Z Building wheel for libtorch_agnostic (setup.py) ... done 2025-10-10T02:14:03.8667179Z Created wheel for libtorch_agnostic: filename=libtorch_agnostic-0.0-cp39-abi3-linux_x86_64.whl size=34680 sha256=723c68fa8692172c5eb8f9b709ea4150ac1fdfc8ebcc525e525a13af432a0e3d 2025-10-10T02:14:03.8669452Z Stored in directory: /tmp/pip-ephem-wheel-cache-nbl0dq_1/wheels/0d/08/74/4ba0a92b390e7b767925227eeb64822a849cf3565e6a5de83a 2025-10-10T02:14:03.8687137Z Successfully built libtorch_agnostic 2025-10-10T02:14:04.1447469Z Installing collected packages: libtorch_agnostic 2025-10-10T02:14:04.1508609Z Successfully installed libtorch_agnostic-0.0 2025-10-10T02:14:04.1953359Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:14:04.1960607Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cpp_extensions_aot_no_ninja.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:14:04.195376] 2025-10-10T02:14:07.8967418Z 2025-10-10T02:14:07.8968636Z test_cpp_extensions_aot_no_ninja 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cpp_extensions_aot_no_ninja_1.1_19d1c63b931d865d_.log 2025-10-10T02:14:07.8981767Z Running 21 items in this shard: test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_backward, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_cublas_extension, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_cuda_dlink_libs, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_cuda_extension, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_cusolver_extension, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_extension_function, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_extension_module, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_mps_extension, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_no_python_abi_suffix_sets_the_correct_library_name, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_optional, test/test_cpp_extensions_aot_no_ninja.py::TestCppExtensionAOT::test_sycl_extension, test/test_cpp_extensions_aot_no_ninja.py::TestPybindTypeCasters::test_pybind_return_types, test/test_cpp_extensions_aot_no_ninja.py::TestMAIATensor::test_add, test/test_cpp_extensions_aot_no_ninja.py::TestMAIATensor::test_autocast_apis_for_maia_device, test/test_cpp_extensions_aot_no_ninja.py::TestMAIATensor::test_conv_backend_override, test/test_cpp_extensions_aot_no_ninja.py::TestMAIATensor::test_matmul_autocast_default_precision, test/test_cpp_extensions_aot_no_ninja.py::TestMAIATensor::test_matmul_autocast_float16_precision, test/test_cpp_extensions_aot_no_ninja.py::TestMAIATensor::test_unregistered, test/test_cpp_extensions_aot_no_ninja.py::TestMAIATensor::test_zeros, 
test/test_cpp_extensions_aot_no_ninja.py::TestRNGExtension::test_rng, test/test_cpp_extensions_aot_no_ninja.py::TestTorchLibrary::test_torch_library 2025-10-10T02:14:07.8994044Z 2025-10-10T02:14:07.8994562Z Running test_cpp_extensions_jit 1/1 ... [2025-10-10 02:14:07.895076] 2025-10-10T02:14:07.8995550Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:14:07.8997634Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cpp_extensions_jit.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:14:07.895695] 2025-10-10T02:19:42.5373618Z 2025-10-10T02:19:42.5375376Z test_cpp_extensions_jit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cpp_extensions_jit_1.1_c4ae792b5ac39f92_.log 2025-10-10T02:19:42.5398392Z Running 34 items in this shard: test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_aoti_torch_call_dispatcher, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_autograd_from_cpp, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_compilation_error_formatting, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_cpp_frontend_module_has_same_output_as_python, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_cpp_frontend_module_has_up_to_date_attributes, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_cpp_frontend_module_python_inter_op, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_cpp_frontend_module_python_inter_op_with_cuda, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_cuda_arch_flags_default_gencode, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_cuda_arch_flags_non_default_gencode, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_cuda_pluggable_allocator_include, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_custom_compound_op_autograd, 
test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_custom_functorch_error, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_gen_extension_h_pch, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_half_support, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_inline_jit_compile_custom_op_cuda, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_inline_jit_compile_extension_cuda, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_inline_jit_compile_extension_multiple_sources_and_no_functions, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_inline_jit_compile_extension_throws_when_functions_is_bad, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_inline_jit_compile_extension_with_functions_as_dict, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_inline_jit_compile_extension_with_functions_as_list, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_inline_jit_compile_extension_xpu, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_jit_compile_extension, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_jit_cuda_archflags, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_jit_cuda_extension, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_jit_cudnn_extension, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_jit_xpu_archlists, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_jit_xpu_extension, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_lenient_flag_handling_in_jit_extensions, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_load_with_non_platform_default_encoding, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_mps_extension, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_reload_jit_extension, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_returns_shared_library_path_when_is_python_module_is_true, 
test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_set_default_type_also_changes_aten_default_type, test/test_cpp_extensions_jit.py::TestCppExtensionJIT::test_warning 2025-10-10T02:19:42.5419249Z 2025-10-10T02:19:42.5419735Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T02:19:42.5420565Z Uploading artifacts took 0.00 seconds 2025-10-10T02:19:42.5421318Z Running test_cpp_extensions_mtia_backend 1/1 ... [2025-10-10 02:19:42.537650] 2025-10-10T02:19:42.5422331Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:19:42.5424168Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cpp_extensions_mtia_backend.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:19:42.538212] 2025-10-10T02:19:45.5615618Z 2025-10-10T02:19:45.5617768Z test_cpp_extensions_mtia_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cpp_extensions_mtia_backend_1.1_8b70a340ffa0a59c_.log 2025-10-10T02:19:45.5623556Z Running 5 items in this shard: test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_device_context, test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_get_device_module, test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_stream_basic, test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_stream_context, test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_stream_context_different_device 2025-10-10T02:19:45.5627263Z 2025-10-10T02:19:45.5627744Z Running test_cpp_extensions_stream_and_event 1/1 ... 
[2025-10-10 02:19:45.561679] 2025-10-10T02:19:45.5628613Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:19:45.5630521Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cpp_extensions_stream_and_event.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:19:45.562245] 2025-10-10T02:19:48.5912057Z 2025-10-10T02:19:48.5914569Z test_cpp_extensions_stream_and_event 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cpp_extensions_stream_and_event_1.1_52b62025475f6493_.log 2025-10-10T02:19:48.5916733Z Running 1 items in this shard: test/test_cpp_extensions_stream_and_event.py::TestCppExtensionStreamAndEvent::test_stream_event 2025-10-10T02:19:48.5917816Z 2025-10-10T02:19:48.5918245Z Running test_cuda_primary_ctx 1/1 ... [2025-10-10 02:19:48.591346] 2025-10-10T02:19:48.5919315Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:19:48.5927524Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cuda_primary_ctx.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:19:48.591956] 2025-10-10T02:20:04.8015740Z 2025-10-10T02:20:04.8017278Z test_cuda_primary_ctx 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cuda_primary_ctx_1.1_470a4afbeb8ff6fb_.log 2025-10-10T02:20:04.8019190Z Running 4 items in this shard: test/test_cuda_primary_ctx.py::TestCudaPrimaryCtx::test_copy, test/test_cuda_primary_ctx.py::TestCudaPrimaryCtx::test_pin_memory, test/test_cuda_primary_ctx.py::TestCudaPrimaryCtx::test_set_device_0, test/test_cuda_primary_ctx.py::TestCudaPrimaryCtx::test_str_repr 2025-10-10T02:20:04.8020774Z Running 1 items in this shard: test/test_cuda_primary_ctx.py::TestCudaPrimaryCtx::test_copy 2025-10-10T02:20:04.8021510Z Running 1 items in this shard: test/test_cuda_primary_ctx.py::TestCudaPrimaryCtx::test_pin_memory 2025-10-10T02:20:04.8022283Z Running 1 items in this shard: test/test_cuda_primary_ctx.py::TestCudaPrimaryCtx::test_set_device_0 2025-10-10T02:20:04.8023639Z Running 1 items in this shard: test/test_cuda_primary_ctx.py::TestCudaPrimaryCtx::test_str_repr 2025-10-10T02:20:04.8024309Z 2025-10-10T02:20:04.8024695Z Running test_cuda_trace 1/1 ... [2025-10-10 02:20:04.801449] 2025-10-10T02:20:04.8025484Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:20:04.8027394Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cuda_trace.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:20:04.801844] 2025-10-10T02:20:46.7651950Z 2025-10-10T02:20:46.7656335Z test_cuda_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cuda_trace_1.1_b430ab2dd462e031_.log 2025-10-10T02:20:46.7659851Z Running 12 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_all_trace_callbacks_called, test/test_cuda_trace.py::TestCudaTrace::test_device_synchronization_callback, test/test_cuda_trace.py::TestCudaTrace::test_event_creation_callback, test/test_cuda_trace.py::TestCudaTrace::test_event_deletion_callback, test/test_cuda_trace.py::TestCudaTrace::test_event_record_callback, test/test_cuda_trace.py::TestCudaTrace::test_event_synchronization_callback, test/test_cuda_trace.py::TestCudaTrace::test_event_wait_callback, test/test_cuda_trace.py::TestCudaTrace::test_memcpy_synchronization, test/test_cuda_trace.py::TestCudaTrace::test_memory_allocation_callback, test/test_cuda_trace.py::TestCudaTrace::test_memory_deallocation_callback, test/test_cuda_trace.py::TestCudaTrace::test_stream_creation_callback, test/test_cuda_trace.py::TestCudaTrace::test_stream_synchronization_callback 2025-10-10T02:20:46.7667548Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_all_trace_callbacks_called 2025-10-10T02:20:46.7668878Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_device_synchronization_callback 2025-10-10T02:20:46.7670174Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_event_creation_callback 2025-10-10T02:20:46.7671454Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_event_deletion_callback 2025-10-10T02:20:46.7672708Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_event_record_callback 2025-10-10T02:20:46.7674439Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_event_synchronization_callback 2025-10-10T02:20:46.7675824Z Running 1 items in this 
shard: test/test_cuda_trace.py::TestCudaTrace::test_event_wait_callback 2025-10-10T02:20:46.7677059Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_memcpy_synchronization 2025-10-10T02:20:46.7678326Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_memory_allocation_callback 2025-10-10T02:20:46.7679623Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_memory_deallocation_callback 2025-10-10T02:20:46.7681124Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_stream_creation_callback 2025-10-10T02:20:46.7682650Z Running 1 items in this shard: test/test_cuda_trace.py::TestCudaTrace::test_stream_synchronization_callback 2025-10-10T02:20:46.7683565Z 2025-10-10T02:20:46.7683898Z Running test_dispatch 1/1 ... [2025-10-10 02:20:46.765663] 2025-10-10T02:20:46.7684397Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:20:46.7685465Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_dispatch.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:20:46.766255] 2025-10-10T02:21:25.7784123Z 2025-10-10T02:21:25.7785466Z test_dispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_dispatch_1.1_ad87d9b8e40e4800_.log 2025-10-10T02:21:25.7803268Z Running 32 items in this shard: test/test_dispatch.py::TestDispatch::test_all_invariants, test/test_dispatch.py::TestDispatch::test_computed_table, test/test_dispatch.py::TestDispatch::test_computed_table_with_ambiguous_autogradother, test/test_dispatch.py::TestDispatch::test_computed_table_with_autograd, test/test_dispatch.py::TestDispatch::test_computed_table_with_cpu_autograd_defaultbackend, test/test_dispatch.py::TestDispatch::test_computed_table_with_cpu_autograd_math, test/test_dispatch.py::TestDispatch::test_computed_table_with_cpu_autograd_math_defaultbackend, test/test_dispatch.py::TestDispatch::test_computed_table_with_cpu_defaultbackend, test/test_dispatch.py::TestDispatch::test_computed_table_with_cpu_math, test/test_dispatch.py::TestDispatch::test_computed_table_with_cpu_math_autogradcpu_fallthrough, test/test_dispatch.py::TestDispatch::test_computed_table_with_math, test/test_dispatch.py::TestDispatch::test_def, test/test_dispatch.py::TestDispatch::test_def_impl_schema_mismatch, test/test_dispatch.py::TestDispatch::test_def_only, test/test_dispatch.py::TestDispatch::test_def_with_explicit_alias, test/test_dispatch.py::TestDispatch::test_def_with_inference, test/test_dispatch.py::TestDispatch::test_dispatch_print_registrations_for_dispatch_key_invalid, test/test_dispatch.py::TestDispatch::test_find_dangling_impls, test/test_dispatch.py::TestDispatch::test_find_dangling_impls_ext, test/test_dispatch.py::TestDispatch::test_impl_only, test/test_dispatch.py::TestDispatch::test_multiple_def_alias_defaulting, test/test_dispatch.py::TestDispatch::test_multiple_def_alias_mismatch, test/test_dispatch.py::TestDispatch::test_multiple_def_error, test/test_dispatch.py::TestDispatch::test_multiple_fallback, 
test/test_dispatch.py::TestDispatch::test_overwrite_math, test/test_dispatch.py::TestPythonDispatcher::test_autogradother, test/test_dispatch.py::TestPythonDispatcher::test_basic, test/test_dispatch.py::TestPythonDispatcher::test_defaultbackend_autogradcpu, test/test_dispatch.py::TestPythonDispatcher::test_defaultbackend_math, test/test_dispatch.py::TestPythonDispatcher::test_duplicate_registrations, test/test_dispatch.py::TestPythonDispatcher::test_math_autogradcpu, test/test_dispatch.py::TestPythonDispatcher::test_quantized_structured_not_implemented 2025-10-10T02:21:25.7820402Z 2025-10-10T02:21:25.7820791Z Running test_extension_utils 1/1 ... [2025-10-10 02:21:25.778824] 2025-10-10T02:21:25.7821552Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:21:25.7823671Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_extension_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:21:25.779473] 2025-10-10T02:21:29.2540284Z 2025-10-10T02:21:29.2541583Z test_extension_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_extension_utils_1.1_2358fa71cde37a42_.log 2025-10-10T02:21:29.2544364Z Running 2 items in this shard: test/test_extension_utils.py::TestExtensionUtils::test_external_module_register, test/test_extension_utils.py::TestExtensionUtils::test_external_module_register_with_renamed_backend 2025-10-10T02:21:29.2546130Z 2025-10-10T02:21:29.2546645Z Running test_jit_disabled 1/1 ... [2025-10-10 02:21:29.254189] 2025-10-10T02:21:29.2547591Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:21:29.2554346Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jit_disabled.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:21:29.254827] 2025-10-10T02:21:32.7296812Z 2025-10-10T02:21:32.7298541Z test_jit_disabled 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_jit_disabled_1.1_a78d6dabc44c1803_.log 2025-10-10T02:21:32.7301807Z Running 3 items in this shard: test/test_jit_disabled.py::TestJitDisabled::test_attribute, test/test_jit_disabled.py::TestJitDisabled::test_recursive_script, test/test_jit_disabled.py::TestJitDisabled::test_script_module_construction 2025-10-10T02:21:32.7303677Z 2025-10-10T02:21:32.7304083Z Running test_multiprocessing 1/1 ... [2025-10-10 02:21:32.729864] 2025-10-10T02:21:32.7304870Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:21:32.7310509Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_multiprocessing.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:21:32.730415] 2025-10-10T02:22:42.1183789Z 2025-10-10T02:22:42.1185190Z test_multiprocessing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_multiprocessing_1.1_74917ac6c7eaa62a_.log 2025-10-10T02:22:42.1213554Z Running 42 items in this shard: test/test_multiprocessing.py::TestMultiprocessing::test_autograd_errors, test/test_multiprocessing.py::TestMultiprocessing::test_autograd_fine_with_spawn, test/test_multiprocessing.py::TestMultiprocessing::test_cuda_bad_call, test/test_multiprocessing.py::TestMultiprocessing::test_cuda_ipc_deadlock, test/test_multiprocessing.py::TestMultiprocessing::test_cuda_memory_allocation, test/test_multiprocessing.py::TestMultiprocessing::test_cuda_parameter_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_cuda_send_many, test/test_multiprocessing.py::TestMultiprocessing::test_cuda_simple, test/test_multiprocessing.py::TestMultiprocessing::test_cuda_small_tensors, 
test/test_multiprocessing.py::TestMultiprocessing::test_cuda_variable_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_empty_shared, test/test_multiprocessing.py::TestMultiprocessing::test_empty_tensor_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_empty_tensor_sharing_cuda, test/test_multiprocessing.py::TestMultiprocessing::test_empty_tensor_sharing_meta, test/test_multiprocessing.py::TestMultiprocessing::test_event, test/test_multiprocessing.py::TestMultiprocessing::test_event_handle_exporter, test/test_multiprocessing.py::TestMultiprocessing::test_event_handle_importer, test/test_multiprocessing.py::TestMultiprocessing::test_event_handle_multi_gpu, test/test_multiprocessing.py::TestMultiprocessing::test_event_multiprocess, test/test_multiprocessing.py::TestMultiprocessing::test_fd_pool, test/test_multiprocessing.py::TestMultiprocessing::test_fd_preserve_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_fd_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_fs, test/test_multiprocessing.py::TestMultiprocessing::test_fs_is_shared, test/test_multiprocessing.py::TestMultiprocessing::test_fs_pool, test/test_multiprocessing.py::TestMultiprocessing::test_fs_preserve_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_fs_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_inherit_tensor, test/test_multiprocessing.py::TestMultiprocessing::test_integer_parameter_serialization_cpu, test/test_multiprocessing.py::TestMultiprocessing::test_integer_parameter_serialization_cuda, test/test_multiprocessing.py::TestMultiprocessing::test_is_shared, test/test_multiprocessing.py::TestMultiprocessing::test_is_shared_cuda, test/test_multiprocessing.py::TestMultiprocessing::test_leaf_variable_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_meta_simple, test/test_multiprocessing.py::TestMultiprocessing::test_mixed_types_cuda_sharing, 
test/test_multiprocessing.py::TestMultiprocessing::test_non_leaf_variable_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_parameter_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_rebuild_cuda_tensor, test/test_multiprocessing.py::TestMultiprocessing::test_set_thread_name, test/test_multiprocessing.py::TestMultiprocessing::test_tensor_sharing_meta, test/test_multiprocessing.py::TestMultiprocessing::test_variable_sharing, test/test_multiprocessing.py::TestMultiprocessing::test_wrong_cuda_fork 2025-10-10T02:22:42.1235944Z 2025-10-10T02:22:42.1236417Z Running test_namedtuple_return_api 1/1 ... [2025-10-10 02:22:42.118733] 2025-10-10T02:22:42.1237661Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:22:42.1239476Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_namedtuple_return_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:22:42.119377] 2025-10-10T02:22:46.6969301Z 2025-10-10T02:22:46.6970751Z test_namedtuple_return_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_namedtuple_return_api_1.1_9c8ac612c812e838_.log 2025-10-10T02:22:46.6988854Z Running 3 items in this shard: test/test_namedtuple_return_api.py::TestNamedTupleAPI::test_import_return_types, test/test_namedtuple_return_api.py::TestNamedTupleAPI::test_namedtuple_return, test/test_namedtuple_return_api.py::TestNamedTupleAPI::test_native_functions_yaml 2025-10-10T02:22:46.6990776Z 2025-10-10T02:22:46.6991080Z Running test_native_mha 1/1 ... 
[2025-10-10 02:22:46.697130] 2025-10-10T02:22:46.6992050Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:22:46.6993760Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_native_mha.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:22:46.697666] 2025-10-10T02:22:52.3279612Z 2025-10-10T02:22:52.3281864Z test_native_mha 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_native_mha_1.1_279386cb6670e60d_.log 2025-10-10T02:22:52.3348635Z Running 54 items in this shard: test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_attention_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_attention_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_encoder_decoder_attention_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_encoder_decoder_attention_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_False_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_False_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_False_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_False_fused_True_cuda_float32, 
test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_True_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_True_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_True_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_True_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_False_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_False_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_False_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_False_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_True_fused_False_cuda_float16, 
test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_True_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_True_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_True_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_False_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_False_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_False_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_False_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_True_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_True_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_True_fused_True_cuda_float16, 
test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_False_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_True_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_False_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_False_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_False_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_False_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_True_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_True_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_True_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_False_pad_all_False_need_weights_False_average_attn_weights_True_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_False_fused_False_cuda_float16, 
test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_False_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_False_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_False_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_True_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_True_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_True_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_False_need_weights_False_average_attn_weights_True_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_False_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_False_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_False_fused_True_cuda_float16, 
test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_False_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_True_fused_False_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_True_fused_False_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_True_fused_True_cuda_float16, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_native_multihead_self_attention_use_nt_True_use_padding_True_pad_all_True_need_weights_False_average_attn_weights_True_fused_True_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_transform_bias_rescale_qkv_cuda_float32, test/test_native_mha.py::TestMHADeviceTypeCUDA::test_transform_bias_rescale_qkv_nested_cuda_float32 2025-10-10T02:22:52.3430194Z 2025-10-10T02:22:52.3430584Z Running test_python_dispatch 1/1 ... [2025-10-10 02:22:52.328403] 2025-10-10T02:22:52.3431312Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:22:52.3433319Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_python_dispatch.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:22:52.328972] 2025-10-10T02:22:58.4603309Z 2025-10-10T02:22:58.4604941Z test_python_dispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_python_dispatch_1.1_9284c476519605a3_.log 2025-10-10T02:22:58.4680060Z Running 119 items in this shard: test/test_python_dispatch.py::TestDispatcherPythonBindings::test_call_boxed, test/test_python_dispatch.py::TestPythonRegistration::test_alias_analysis, test/test_python_dispatch.py::TestPythonRegistration::test_create_new_library, test/test_python_dispatch.py::TestPythonRegistration::test_create_new_library_fragment_no_existing, test/test_python_dispatch.py::TestPythonRegistration::test_create_new_library_fragment_with_existing, test/test_python_dispatch.py::TestPythonRegistration::test_dispatcher_error_filenames, test/test_python_dispatch.py::TestPythonRegistration::test_dispatchkeyset_eq, test/test_python_dispatch.py::TestPythonRegistration::test_dispatchkeyset_pickle, test/test_python_dispatch.py::TestPythonRegistration::test_error_for_unsupported_ns_or_kind, test/test_python_dispatch.py::TestPythonRegistration::test_error_if_fn_not_callable, test/test_python_dispatch.py::TestPythonRegistration::test_extend_library_with_dispatch_key_arg, test/test_python_dispatch.py::TestPythonRegistration::test_fallback, test/test_python_dispatch.py::TestPythonRegistration::test_fallback_fallthrough, test/test_python_dispatch.py::TestPythonRegistration::test_fallback_keyset, test/test_python_dispatch.py::TestPythonRegistration::test_fallthrough_for_dense_key_with_meta_in_tls, test/test_python_dispatch.py::TestPythonRegistration::test_finalizer, test/test_python_dispatch.py::TestPythonRegistration::test_override_aten_ops_with_multiple_libraries, test/test_python_dispatch.py::TestPythonRegistration::test_override_cpu_sum, test/test_python_dispatch.py::TestPythonRegistration::test_override_cuda_with_jiterator, 
test/test_python_dispatch.py::TestPythonRegistration::test_register_fallthrough, test/test_python_dispatch.py::TestPythonRegistration::test_returning_symint, test/test_python_dispatch.py::TestPythonDispatch::test_all_same_mode, test/test_python_dispatch.py::TestPythonDispatch::test_autograd_in_attr, test/test_python_dispatch.py::TestPythonDispatch::test_basic, test/test_python_dispatch.py::TestPythonDispatch::test_capture_logs_with_torch_dispatch_mode, test/test_python_dispatch.py::TestPythonDispatch::test_construct_int_tensor, test/test_python_dispatch.py::TestPythonDispatch::test_custom_autograd, test/test_python_dispatch.py::TestPythonDispatch::test_custom_dispatch_mode_not_supports_higher_order_operators, test/test_python_dispatch.py::TestPythonDispatch::test_custom_dispatch_mode_supports_higher_order_operators, test/test_python_dispatch.py::TestPythonDispatch::test_custom_size_policy_dynamic_shapes, test/test_python_dispatch.py::TestPythonDispatch::test_data_ptr_respects_numel_slow_path, test/test_python_dispatch.py::TestPythonDispatch::test_deepcopy_non_wrapper_subclass, test/test_python_dispatch.py::TestPythonDispatch::test_deepcopy_wrapper_subclass, test/test_python_dispatch.py::TestPythonDispatch::test_deepcopy_wrapper_subclass_with_clone_returning_different_type, test/test_python_dispatch.py::TestPythonDispatch::test_detach_appears_twice_when_called_once, test/test_python_dispatch.py::TestPythonDispatch::test_device_slowpath, test/test_python_dispatch.py::TestPythonDispatch::test_dim_slowpath, test/test_python_dispatch.py::TestPythonDispatch::test_dispatch_super_call, test/test_python_dispatch.py::TestPythonDispatch::test_dispatch_super_call_list_arg, test/test_python_dispatch.py::TestPythonDispatch::test_dispatch_super_dont_autograd, test/test_python_dispatch.py::TestPythonDispatch::test_dispatch_uint64, test/test_python_dispatch.py::TestPythonDispatch::test_error_using_class_method_on_mode, 
test/test_python_dispatch.py::TestPythonDispatch::test_exception_handling, test/test_python_dispatch.py::TestPythonDispatch::test_fancy_strides, test/test_python_dispatch.py::TestPythonDispatch::test_format, test/test_python_dispatch.py::TestPythonDispatch::test_get_cur_mode, test/test_python_dispatch.py::TestPythonDispatch::test_get_mode_stack, test/test_python_dispatch.py::TestPythonDispatch::test_index_put_where_only_index_is_subclass, test/test_python_dispatch.py::TestPythonDispatch::test_invalid_ret, test/test_python_dispatch.py::TestPythonDispatch::test_is_contiguous_slow_path, test/test_python_dispatch.py::TestPythonDispatch::test_kwarg_only, test/test_python_dispatch.py::TestPythonDispatch::test_kwarg_only_and_positional_default, test/test_python_dispatch.py::TestPythonDispatch::test_layout_slow_path, test/test_python_dispatch.py::TestPythonDispatch::test_like, test/test_python_dispatch.py::TestPythonDispatch::test_list_ret, test/test_python_dispatch.py::TestPythonDispatch::test_make_fx_with_subclass, test/test_python_dispatch.py::TestPythonDispatch::test_make_subclass_with_modes, test/test_python_dispatch.py::TestPythonDispatch::test_make_wrapper_subclass_noalloc, test/test_python_dispatch.py::TestPythonDispatch::test_make_wrapper_subclass_propagates_metadata, test/test_python_dispatch.py::TestPythonDispatch::test_maybe_tuple_bug, test/test_python_dispatch.py::TestPythonDispatch::test_mode_detection, test/test_python_dispatch.py::TestPythonDispatch::test_mode_with_make_subclass, test/test_python_dispatch.py::TestPythonDispatch::test_multiple_ops_subclass, test/test_python_dispatch.py::TestPythonDispatch::test_nested_push_logging_tensor_mode, test/test_python_dispatch.py::TestPythonDispatch::test_nesting_same_mode, test/test_python_dispatch.py::TestPythonDispatch::test_new_ones, test/test_python_dispatch.py::TestPythonDispatch::test_none_wrapping, test/test_python_dispatch.py::TestPythonDispatch::test_notimplemented_mode, 
test/test_python_dispatch.py::TestPythonDispatch::test_optional_tensor_list, test/test_python_dispatch.py::TestPythonDispatch::test_out, test/test_python_dispatch.py::TestPythonDispatch::test_produce_real_type, test/test_python_dispatch.py::TestPythonDispatch::test_record_stream, test/test_python_dispatch.py::TestPythonDispatch::test_return_and_correct_aliasing_gives_correct_stride, test/test_python_dispatch.py::TestPythonDispatch::test_return_stream, test/test_python_dispatch.py::TestPythonDispatch::test_set_data, test/test_python_dispatch.py::TestPythonDispatch::test_shallow_copy_and_detach, test/test_python_dispatch.py::TestPythonDispatch::test_sizes_slow_path, test/test_python_dispatch.py::TestPythonDispatch::test_standard_is_not_subclass, test/test_python_dispatch.py::TestPythonDispatch::test_storage, test/test_python_dispatch.py::TestPythonDispatch::test_storage_can_be_converted_to_python_object, test/test_python_dispatch.py::TestPythonDispatch::test_strides_slow_path, test/test_python_dispatch.py::TestPythonDispatch::test_subclass_creation, test/test_python_dispatch.py::TestPythonDispatch::test_subclass_priority, test/test_python_dispatch.py::TestPythonDispatch::test_sym_sizes_strides_slow_path, test/test_python_dispatch.py::TestPythonDispatch::test_tolist_numpy_with_torch_dispatch_mode, test/test_python_dispatch.py::TestPythonDispatch::test_torch_dispatch_mode_basic, test/test_python_dispatch.py::TestPythonDispatch::test_torch_dispatch_mode_respects_no_dispatch, test/test_python_dispatch.py::TestPythonDispatch::test_torch_dispatch_mode_subclass_priority, test/test_python_dispatch.py::TestPythonDispatch::test_torch_dispatch_mode_unrelated_tensors, test/test_python_dispatch.py::TestPythonDispatch::test_version, test/test_python_dispatch.py::TestPythonDispatch::test_view_returns_alias_under_torch_dispatch, test/test_python_dispatch.py::TestPythonDispatch::test_with_mode_created_separately, 
test/test_python_dispatch.py::TestPythonDispatch::test_with_nested_modes, test/test_python_dispatch.py::TestPythonDispatch::test_wrapper_subclass_extra_dispatch_keys, test/test_python_dispatch.py::TestPythonDispatch::test_wrapper_subclass_multiprocessing_preserves_dtype, test/test_python_dispatch.py::TestPythonDispatch::test_wrapper_subclass_reentrant_dispatch_with_mode, test/test_python_dispatch.py::TestPythonDispatch::test_wrapper_subclass_serializes, test/test_python_dispatch.py::TestPythonDispatcher::test_basic, test/test_python_dispatch.py::TestPythonDispatcher::test_lstsq, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_cat_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_conv2d_cuda, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpyCatCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpyCubeCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpyMulCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpyMulScalarCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpyNMSCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpyNonzeroCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpySortCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpySplitCopyCustomOp_cuda_float32, 
test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpySplitCopyWithIntCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpyTakeCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_custom_NumpyViewCopyCustomOp_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_fft_fft2_cuda, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_mul_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_native_batch_norm_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_out_op_cuda, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_split_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_split_list_args_cuda_float32, test/test_python_dispatch.py::TestWrapperSubclassAliasingCUDA::test_wrapper_subclass_aliasing_view_cuda_float32 2025-10-10T02:22:58.4753699Z 2025-10-10T02:22:58.4754048Z Running test_show_pickle 1/1 ... [2025-10-10 02:22:58.460723] 2025-10-10T02:22:58.4754849Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:22:58.4756602Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_show_pickle.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:22:58.461310] 2025-10-10T02:23:01.7357144Z 2025-10-10T02:23:01.7358371Z test_show_pickle 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_show_pickle_1.1_479beb2162dc1f98_.log 2025-10-10T02:23:01.7360112Z Running 1 items in this shard: test/test_show_pickle.py::TestShowPickle::test_scripted_model 2025-10-10T02:23:01.7360812Z 2025-10-10T02:23:01.7364787Z Running test_sort_and_select 1/1 ... [2025-10-10 02:23:01.735867] 2025-10-10T02:23:01.7365592Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:23:01.7371478Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sort_and_select.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:23:01.736426] 2025-10-10T02:23:09.1707970Z 2025-10-10T02:23:09.1712009Z test_sort_and_select 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_sort_and_select_1.1_f02543d25c1ad6d8_.log 2025-10-10T02:23:09.1774407Z Running 111 items in this shard: test/test_sort_and_select.py::TestSortAndSelectCUDA::test_complex_unsupported_cpu_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_cuda_float16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_different_devices_cuda_float32, 
test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_different_devices_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_different_devices_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_different_devices_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_different_devices_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_different_devices_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_different_devices_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_isin_different_dtypes_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_kthvalue_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_kthvalue_scalar_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_bfloat16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_float16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_msort_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_1d_output_discontiguous_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_1d_parallel_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_1d_parallel_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_1d_parallel_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_1d_parallel_cuda_int8, 
test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_1d_parallel_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_discontiguous_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_discontiguous_slow_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_expanded_tensor_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_cuda_float16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_slice_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_overflow_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_overflow_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_overflow_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_overflow_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_overflow_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_restride_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_stable_none_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_bfloat16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_bool, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_float16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_int64, 
test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_against_numpy_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_bfloat16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_bool, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_float16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_stable_sort_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_1d_output_discontiguous_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_4d_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_arguments_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_integral_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_integral_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_integral_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_integral_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_integral_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_lower_precision_cuda_bfloat16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_lower_precision_cuda_float16, 
test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_noncontiguous_gpu_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_nonfinite_cuda_bfloat16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_nonfinite_cuda_float16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_nonfinite_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_nonfinite_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_quantized_scalar_input_cuda, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_zero_cuda_bfloat16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_zero_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_zero_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_zero_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_zero_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_zero_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_zero_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_topk_zero_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_bool, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_float16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_consecutive_cuda_uint8, 
test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_bool, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_float16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_float32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_float64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_int16, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_int32, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_int64, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_int8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_cuda_uint8, test/test_sort_and_select.py::TestSortAndSelectCUDA::test_unique_dim_cuda 2025-10-10T02:23:09.1835762Z 2025-10-10T02:23:09.1836159Z Running test_tensor_creation_ops 1/1 ... [2025-10-10 02:23:09.171150] 2025-10-10T02:23:09.1836916Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:23:09.1838741Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_tensor_creation_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:23:09.171739] 2025-10-10T02:25:00.9657012Z 2025-10-10T02:25:00.9662787Z test_tensor_creation_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_tensor_creation_ops_1.1_a34035192a0eb6c8_.log 2025-10-10T02:25:01.0002133Z Running 533 items in this shard: test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_arange_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_arange_device_vs_cpu_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_arange_device_vs_cpu_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_arange_device_vs_cpu_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_arange_device_vs_cpu_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_arange_inference_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_arange_lowp_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_arange_lowp_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_as_strided_neg_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_as_tensor_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_block_diag_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_block_diag_scipy_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cartesian_prod_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat2_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat2_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat2_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_all_dtypes_and_devices_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_big_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_channels_last_large_inputs_cuda, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_empty_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_empty_legacy_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_in_channels_last_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_large_tensor_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_mem_overlap_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_misaligned_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_multi_batch_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_channels_last_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_uint16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_uint32, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_uint64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_fast_path_dim0_dim1_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_out_memory_format_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_preserve_channels_last_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_size1_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_stack_cross_devices_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_cat_trailing_dim_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_combinations_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_complex_type_conversions_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_concat_empty_list_error_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_constructor_device_legacy_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_constructor_dtypes_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_ctor_with_numpy_array_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_device_rounding_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_device_rounding_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_device_rounding_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_diag_embed_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_diagflat_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dsplit_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dsplit_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dsplit_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_complex128, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_dstack_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_empty_full_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_empty_overflow_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_empty_strided_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_empty_tensor_props_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_eye_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_fill_all_dtypes_and_devices_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_finite_cuda_bool, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_finite_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_finite_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_finite_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_finite_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_finite_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_nonfinite_cuda_bool, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_nonfinite_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_nonfinite_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_nonfinite_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_nonfinite_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_float_to_int_conversion_nonfinite_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_from_file_shared_False_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_from_file_shared_True_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_full_inference_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_full_inference_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_full_inference_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_full_out_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hsplit_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hsplit_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hsplit_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_int16, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_hstack_column_stack_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_kaiser_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_kaiser_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_kaiser_window_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_kaiser_window_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_kaiser_window_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_kaiser_window_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_kaiser_window_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_large_linspace_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_large_linspace_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_like_fn_stride_proparation_vs_tensoriterator_unary_op_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linlogspace_mem_overlap_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_int16, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_deduction_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_device_vs_cpu_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_device_vs_cpu_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_device_vs_cpu_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_device_vs_cpu_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_device_vs_cpu_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_device_vs_cpu_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_special_steps_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_special_steps_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_special_steps_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_special_steps_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_special_steps_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_special_steps_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_complex_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_cuda_complex64, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_integral_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_integral_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_integral_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_integral_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_linspace_vs_numpy_integral_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_base2_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_base2_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_base2_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_deduction_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_device_vs_cpu_cuda_float16, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_device_vs_cpu_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_device_vs_cpu_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_special_steps_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_special_steps_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_special_steps_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_vs_numpy_complex_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_vs_numpy_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_logspace_vs_numpy_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_default_indexing_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_empty_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_ij_indexing_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_ij_indexing_is_default_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_inconsistent_device_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_inconsistent_dtype_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_non_1d_tensor_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_unsupported_indexing_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_vs_numpy_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_warns_if_no_indexing_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_meshgrid_xy_indexing_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_new_empty_strided_cuda, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_new_methods_requires_grad_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_new_tensor_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_new_tensor_device_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_offset_scalar_cast_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_ones_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_bool_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_default_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_bool_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_bfloat16, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_uint16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_uint32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_from_to_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_uint16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_uint32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_full_range_cuda_uint8, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_uint16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_uint32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_random_to_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_range_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_range_factories_64bit_indexing_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_range_warning_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_bool, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_int16, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_refs_tensor_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_repeat_interleave_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_roll_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_bartlett_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_bartlett_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_bartlett_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_bartlett_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_bartlett_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_blackman_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_blackman_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_blackman_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_blackman_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_blackman_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hamming_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hamming_cuda_float16, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hamming_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hamming_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hamming_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hann_cuda_bfloat16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hann_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hann_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hann_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_window_functions_window_hann_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_bartlett_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_bartlett_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_blackman_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_blackman_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_cosine_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_cosine_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_hamming_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_hamming_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_hann_cuda_float32, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_hann_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_nuttall_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_signal_windows_functions_window_nuttall_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_simple_scalar_cast_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_stack_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_stack_out_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_storage_filename_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_strided_mismatched_stride_shape_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_ctor_device_inference_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_device_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_factories_empty_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_factory_copy_var_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_factory_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_factory_gpu_type_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_factory_gpu_type_inference_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_factory_type_inference_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_from_non_writable_numpy_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_tensor_from_sequence_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_cuda_float64, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_floating_dtype_error_cuda_bool, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_floating_dtype_error_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_floating_dtype_error_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_floating_dtype_error_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_floating_dtype_error_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_floating_dtype_error_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_floating_dtype_error_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_floating_dtype_error_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_out_dtype_error_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_out_dtype_error_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_same_dtype_error_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_complex_same_dtype_error_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_polar_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_torch_polar_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_unpack_double_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_unpack_double_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_bool, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_complex128, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vander_types_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vsplit_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vsplit_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vsplit_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_complex128, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_float64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_int32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_int8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_vstack_row_stack_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_bounds_checking_cuda, 
test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_cuda, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_dtype_layout_device_match_cuda_bool, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_dtype_layout_device_match_cuda_complex64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_dtype_layout_device_match_cuda_float16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_dtype_layout_device_match_cuda_float32, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_dtype_layout_device_match_cuda_int16, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_dtype_layout_device_match_cuda_int64, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_dtype_layout_device_match_cuda_uint8, test/test_tensor_creation_ops.py::TestTensorCreationCUDA::test_zeros_out_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_normal_cuda_float32, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_normal_cuda_float64, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_normal_std_error_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_rand_cuda_complex128, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_rand_cuda_complex32, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_rand_cuda_complex64, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_rand_cuda_float32, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_rand_cuda_float64, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randint_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randint_distribution_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randint_inference_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randn_cuda_bfloat16, 
test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randn_cuda_complex128, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randn_cuda_complex32, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randn_cuda_complex64, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randn_cuda_float16, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randn_cuda_float32, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randn_cuda_float64, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_random_neg_values_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randperm_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randperm_device_compatibility_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_randperm_large_cuda, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_uniform_from_to_cuda_bfloat16, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_uniform_from_to_cuda_float16, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_uniform_from_to_cuda_float32, test/test_tensor_creation_ops.py::TestRandomTensorCreationCUDA::test_uniform_from_to_cuda_float64, test/test_tensor_creation_ops.py::TestLikeTensorCreationCUDA::test_empty_like_cuda, test/test_tensor_creation_ops.py::TestLikeTensorCreationCUDA::test_full_like_inference_cuda, test/test_tensor_creation_ops.py::TestLikeTensorCreationCUDA::test_ones_like_cuda, test/test_tensor_creation_ops.py::TestLikeTensorCreationCUDA::test_ones_like_multiple_device_cuda, test/test_tensor_creation_ops.py::TestLikeTensorCreationCUDA::test_zeros_like_cuda, test/test_tensor_creation_ops.py::TestLikeTensorCreationCUDA::test_zeros_like_multiple_device_cuda, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_bool, 
test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_int64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_uint16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_uint32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_uint64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_buffer_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_bfloat16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_int64, 
test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_dlpack_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_bool, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_int64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_uint16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_uint32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_uint64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_numpy_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_bfloat16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_bool, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_float64, 
test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_int64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_alias_from_tensor_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_astensor_consistency_cuda, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_bool, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_int64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_uint16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_uint32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_uint64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_buffer_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_bfloat16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_complex64, 
test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_int64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_bfloat16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_int64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_dlpack_mult_devices_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_bool, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_complex128, 
test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_int64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_uint16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_uint32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_uint64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_numpy_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_bfloat16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_int64, 
test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_from_tensor_mult_devices_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_bfloat16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_bool, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_int64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_list_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_bfloat16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_bool, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_complex128, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_float16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_float64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_int16, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_int32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_int64, 
test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_int8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_copy_tensor_cuda_uint8, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_default_device_cuda, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_device_without_index_cuda, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_numpy_scalars_cuda, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_retain_autograd_history_cuda_complex64, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_retain_autograd_history_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_unsupported_alias_cuda_float32, test/test_tensor_creation_ops.py::TestAsArrayCUDA::test_unsupported_alias_mult_devices_cuda_float32 2025-10-10T02:25:01.0245802Z 2025-10-10T02:25:01.0245959Z Running test_tensorexpr 1/1 ... [2025-10-10 02:25:00.967004] 2025-10-10T02:25:01.0246249Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:25:01.0247086Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_tensorexpr.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:25:00.967624] 2025-10-10T02:25:47.7718045Z 2025-10-10T02:25:47.7719498Z test_tensorexpr 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_tensorexpr_1.1_a9675c6153f6edb1_.log 2025-10-10T02:25:47.7766083Z Running 74 items in this shard: test/test_tensorexpr.py::TestTensorExprFuser::test_add_const_rhs, test/test_tensorexpr.py::TestTensorExprFuser::test_add_sub, test/test_tensorexpr.py::TestTensorExprFuser::test_alias_analysis_input_and_module, test/test_tensorexpr.py::TestTensorExprFuser::test_alias_analysis_inputs, test/test_tensorexpr.py::TestTensorExprFuser::test_alias_analysis_module, test/test_tensorexpr.py::TestTensorExprFuser::test_all_combos, test/test_tensorexpr.py::TestTensorExprFuser::test_alpha, test/test_tensorexpr.py::TestTensorExprFuser::test_binary_ops, test/test_tensorexpr.py::TestTensorExprFuser::test_bitwise_ops, test/test_tensorexpr.py::TestTensorExprFuser::test_broadcast, test/test_tensorexpr.py::TestTensorExprFuser::test_broadcast3, test/test_tensorexpr.py::TestTensorExprFuser::test_broadcast_2, test/test_tensorexpr.py::TestTensorExprFuser::test_broadcast_big2, test/test_tensorexpr.py::TestTensorExprFuser::test_cat, test/test_tensorexpr.py::TestTensorExprFuser::test_cat_empty_tensors, test/test_tensorexpr.py::TestTensorExprFuser::test_cat_negative_dim, test/test_tensorexpr.py::TestTensorExprFuser::test_cat_only, test/test_tensorexpr.py::TestTensorExprFuser::test_cat_promote_inputs, test/test_tensorexpr.py::TestTensorExprFuser::test_cat_with_constant_dim, test/test_tensorexpr.py::TestTensorExprFuser::test_char, test/test_tensorexpr.py::TestTensorExprFuser::test_chunk, test/test_tensorexpr.py::TestTensorExprFuser::test_clamp, test/test_tensorexpr.py::TestTensorExprFuser::test_constant, test/test_tensorexpr.py::TestTensorExprFuser::test_double, test/test_tensorexpr.py::TestTensorExprFuser::test_double_intrinsics, test/test_tensorexpr.py::TestTensorExprFuser::test_dynamic_shape, 
test/test_tensorexpr.py::TestTensorExprFuser::test_easy, test/test_tensorexpr.py::TestTensorExprFuser::test_eq, test/test_tensorexpr.py::TestTensorExprFuser::test_exp_pow, test/test_tensorexpr.py::TestTensorExprFuser::test_four_arg, test/test_tensorexpr.py::TestTensorExprFuser::test_ge, test/test_tensorexpr.py::TestTensorExprFuser::test_gt, test/test_tensorexpr.py::TestTensorExprFuser::test_guard_fails, test/test_tensorexpr.py::TestTensorExprFuser::test_half_bn_relu, test/test_tensorexpr.py::TestTensorExprFuser::test_half_gelu, test/test_tensorexpr.py::TestTensorExprFuser::test_int64_promotion, test/test_tensorexpr.py::TestTensorExprFuser::test_int_output, test/test_tensorexpr.py::TestTensorExprFuser::test_le, test/test_tensorexpr.py::TestTensorExprFuser::test_loop, test/test_tensorexpr.py::TestTensorExprFuser::test_lt, test/test_tensorexpr.py::TestTensorExprFuser::test_mask, test/test_tensorexpr.py::TestTensorExprFuser::test_min_max, test/test_tensorexpr.py::TestTensorExprFuser::test_min_max_reduction, test/test_tensorexpr.py::TestTensorExprFuser::test_min_max_reduction2, test/test_tensorexpr.py::TestTensorExprFuser::test_min_max_reduction_dim1, test/test_tensorexpr.py::TestTensorExprFuser::test_min_max_reduction_dim1_2, test/test_tensorexpr.py::TestTensorExprFuser::test_multi_rand, test/test_tensorexpr.py::TestTensorExprFuser::test_multioutput, test/test_tensorexpr.py::TestTensorExprFuser::test_multiple_outputs, test/test_tensorexpr.py::TestTensorExprFuser::test_nans, test/test_tensorexpr.py::TestTensorExprFuser::test_ne, test/test_tensorexpr.py::TestTensorExprFuser::test_promotion, test/test_tensorexpr.py::TestTensorExprFuser::test_propagated_mem_layout, test/test_tensorexpr.py::TestTensorExprFuser::test_rand_like, test/test_tensorexpr.py::TestTensorExprFuser::test_rank_two, test/test_tensorexpr.py::TestTensorExprFuser::test_relu, test/test_tensorexpr.py::TestTensorExprFuser::test_remainder, test/test_tensorexpr.py::TestTensorExprFuser::test_reps, 
test/test_tensorexpr.py::TestTensorExprFuser::test_round_2, test/test_tensorexpr.py::TestTensorExprFuser::test_scalar, test/test_tensorexpr.py::TestTensorExprFuser::test_short, test/test_tensorexpr.py::TestTensorExprFuser::test_simple_add, test/test_tensorexpr.py::TestTensorExprFuser::test_sin_pow, test/test_tensorexpr.py::TestTensorExprFuser::test_slice, test/test_tensorexpr.py::TestTensorExprFuser::test_sliced_stride, test/test_tensorexpr.py::TestTensorExprFuser::test_softmax_cpu, test/test_tensorexpr.py::TestTensorExprFuser::test_softmax_cuda, test/test_tensorexpr.py::TestTensorExprFuser::test_strided_output_preserved, test/test_tensorexpr.py::TestTensorExprFuser::test_three_arg, test/test_tensorexpr.py::TestTensorExprFuser::test_three_arg2, test/test_tensorexpr.py::TestTensorExprFuser::test_transpose, test/test_tensorexpr.py::TestTensorExprFuser::test_unary_ops, test/test_tensorexpr.py::TestTensorExprFuser::test_unsqueeze, test/test_tensorexpr.py::TestTensorExprFuser::test_where 2025-10-10T02:25:47.7799900Z 2025-10-10T02:25:47.7800209Z Running test_utils 1/1 ... [2025-10-10 02:25:47.772098] 2025-10-10T02:25:47.7800924Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:25:47.7802755Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:25:47.772691] 2025-10-10T02:26:24.0278605Z 2025-10-10T02:26:24.0280914Z test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_utils_1.1_0bea68bca60284ca_.log 2025-10-10T02:26:24.2075616Z Running 6010 items in this shard: test/test_utils.py::TestCheckpoint::test_checkpoint, test/test_utils.py::TestCheckpoint::test_checkpoint_module_list, test/test_utils.py::TestCheckpoint::test_checkpoint_no_tensors, test/test_utils.py::TestCheckpoint::test_checkpoint_non_tensor, test/test_utils.py::TestCheckpoint::test_checkpoint_non_tensor_inputs_outputs, test/test_utils.py::TestCheckpoint::test_checkpoint_not_preserve_rng_state_and_without_reentrant, test/test_utils.py::TestCheckpoint::test_checkpoint_partial_grad, test/test_utils.py::TestCheckpoint::test_checkpoint_rng_cpu, test/test_utils.py::TestCheckpoint::test_checkpoint_rng_cuda, test/test_utils.py::TestCheckpoint::test_checkpoint_sequential_deprecated_multiple_args, test/test_utils.py::TestCheckpoint::test_checkpoint_sequential_deprecated_no_args, test/test_utils.py::TestCheckpoint::test_checkpoint_trigger, test/test_utils.py::TestCheckpoint::test_checkpoint_valid, test/test_utils.py::TestCheckpoint::test_checkpointing_without_reentrant_early_free, test/test_utils.py::TestCheckpoint::test_get_device_states_recursive, test/test_utils.py::TestCheckpoint::test_infer_device_state_recursive_meta, test/test_utils.py::TestCheckpoint::test_infer_device_state_recursive_multi_cuda, test/test_utils.py::TestDataLoaderUtils::test_multi_drop, test/test_utils.py::TestDataLoaderUtils::test_multi_keep, test/test_utils.py::TestDataLoaderUtils::test_random_seed, test/test_utils.py::TestDataLoaderUtils::test_single_drop, test/test_utils.py::TestDataLoaderUtils::test_single_keep, test/test_utils.py::TestCollectEnv::test_smoke, test/test_utils.py::TestHipify::test_import_hipify, test/test_utils.py::TestHipifyTrie::test_add_and_search_trie, 
test/test_utils.py::TestHipifyTrie::test_add_multiple_and_search_trie, test/test_utils.py::TestHipifyTrie::test_char_export_trie_to_regex, test/test_utils.py::TestHipifyTrie::test_export_trie_to_regex, test/test_utils.py::TestHipifyTrie::test_prefix_words_export_trie_to_regex, test/test_utils.py::TestHipifyTrie::test_quote_escape, test/test_utils.py::TestHipifyTrie::test_single_export_trie_to_regex, test/test_utils.py::TestHipifyTrie::test_special_char_export_trie_to_regex, test/test_utils.py::TestAssert::test_assert_scriptable, test/test_utils.py::TestAssert::test_assert_true, test/test_utils.py::TestStandaloneCPPJIT::test_load_standalone, test/test_utils.py::TestRenderUtils::test_basic, test/test_utils.py::TestDeviceUtilsCUDA::test_basic_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_decorator_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_decorator_generator_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__batch_norm_with_update_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__batch_norm_with_update_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__batch_norm_with_update_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__batch_norm_with_update_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__native_batch_norm_legit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__native_batch_norm_legit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__native_batch_norm_legit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__native_batch_norm_legit_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_lengths_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_lengths_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_lengths_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_lengths_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_offsets_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_offsets_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_offsets_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_offsets_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__softmax_backward_data_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__softmax_backward_data_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__softmax_backward_data_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__softmax_backward_data_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__upsample_bilinear2d_aa_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__upsample_bilinear2d_aa_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__upsample_bilinear2d_aa_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__upsample_bilinear2d_aa_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bernoulli_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bernoulli_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bernoulli_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bernoulli_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_shapes_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cauchy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cauchy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cauchy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cauchy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdist_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdist_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_inverse_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_inverse_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_inverse_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_inverse_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_complex_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_complex_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_complex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exponential_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exponential_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exponential_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exponential_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float8_e4m3fn, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float8_e4m3fnuz, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float8_e5m2, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float8_e5m2fnuz, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frac_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frac_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frac_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frac_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_uint16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_uint32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geqrf_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geqrf_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geqrf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geqrf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hypot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hypot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hypot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hypot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_igamma_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_igamma_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_igammac_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_igammac_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_imag_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_imag_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_imag_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_istft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_istft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cond_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cond_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cond_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cond_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_det_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_det_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_det_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_det_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eig_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eig_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eig_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eig_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvals_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvals_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvals_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvals_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvalsh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvalsh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvalsh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvalsh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_householder_product_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_householder_product_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_householder_product_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_householder_product_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_grad_oriented_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_grad_oriented_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_grad_oriented_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_grad_oriented_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_solve_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_power_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_power_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_power_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_power_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_hermitian_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_hermitian_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_hermitian_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_hermitian_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_hermitian_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_hermitian_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_hermitian_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_hermitian_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_singular_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_singular_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_singular_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_singular_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_qr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_qr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_qr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_qr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_slogdet_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_slogdet_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_slogdet_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_slogdet_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_ex_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_triangular_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_triangular_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_triangular_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_triangular_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svd_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svd_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svd_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svd_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svdvals_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svdvals_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svdvals_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svdvals_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorinv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorinv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorinv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorinv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorsolve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorsolve_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorsolve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorsolve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_normal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_normal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_normal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_normal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logdet_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logdet_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logdet_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logdet_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_unpack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_unpack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_unpack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_unpack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_log_softmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_log_softmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_log_softmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_log_softmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logaddexp_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logaddexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logaddexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logaddexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_median_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_median_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_median_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_median_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_pool2d_with_indices_backward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_pool2d_with_indices_backward_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_pool2d_with_indices_backward_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_multinomial_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_multinomial_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_multinomial_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_multinomial_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanquantile_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanquantile_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_batch_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_batch_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_batch_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_batch_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_dropout_backward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_dropout_backward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_dropout_backward_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_dropout_backward_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_layer_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_layer_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_layer_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_layer_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nextafter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nextafter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nextafter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nextafter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool1d_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool3d_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_alpha_dropout_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_alpha_dropout_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_alpha_dropout_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_alpha_dropout_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool3d_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_bilinear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_bilinear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_bilinear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_bilinear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_celu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_celu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_celu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_celu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_similarity_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_similarity_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_similarity_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_similarity_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cross_entropy_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cross_entropy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cross_entropy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cross_entropy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_ctc_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_ctc_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_elu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_elu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_elu_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_elu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_bag_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_bag_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_bag_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_bag_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool3d_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gaussian_nll_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gaussian_nll_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gaussian_nll_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gelu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gelu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gelu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gelu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_glu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_glu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_glu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_glu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_grid_sample_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_grid_sample_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_grid_sample_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_grid_sample_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_group_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_group_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_group_norm_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_group_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardshrink_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardshrink_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardshrink_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardshrink_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardsigmoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardsigmoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardsigmoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardsigmoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardswish_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardswish_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardswish_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardswish_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hinge_embedding_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hinge_embedding_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hinge_embedding_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_huber_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_huber_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_huber_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_huber_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_instance_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_instance_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_instance_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_instance_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_area_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_area_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_area_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_area_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bicubic_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bicubic_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bicubic_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bicubic_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bilinear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bilinear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bilinear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_linear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_linear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_linear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_linear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_trilinear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_trilinear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_trilinear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_kl_div_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_kl_div_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_kl_div_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_kl_div_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_layer_norm_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_layer_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_layer_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_layer_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_leaky_relu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_leaky_relu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_leaky_relu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_leaky_relu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_local_response_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_local_response_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_local_response_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_local_response_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_logsigmoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_logsigmoid_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_logsigmoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_logsigmoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool2d_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_grad_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_grad_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_grad_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_grad_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_grad_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_grad_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_grad_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_grad_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_grad_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mish_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mish_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mish_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mish_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mse_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mse_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mse_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mse_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_head_attention_forward_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_head_attention_forward_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_head_attention_forward_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_margin_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_margin_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_nll_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_nll_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_nll_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_nll_loss_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_one_hot_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pdist_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pdist_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_prelu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_prelu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_prelu_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_prelu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rrelu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rrelu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rrelu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rrelu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_selu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_selu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_selu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_selu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_complex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_complex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_smooth_l1_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_smooth_l1_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_smooth_l1_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_soft_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_soft_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_soft_margin_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softplus_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softplus_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softplus_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softplus_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softshrink_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softshrink_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softshrink_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softshrink_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_bilinear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_bilinear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_bilinear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_nuc_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_nuc_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_nuc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_nuc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_number_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_number_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_number_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_number_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ormqr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ormqr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ormqr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ormqr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pca_lowrank_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pca_lowrank_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pca_lowrank_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pca_lowrank_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pinverse_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pinverse_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pinverse_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pinverse_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polar_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polar_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_qr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_qr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_qr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_qr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_quantile_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_quantile_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_0_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_0_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_3_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_3_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_3_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_3_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_neg_3_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_neg_3_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_neg_3_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_neg_3_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_bartlett_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_bartlett_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_blackman_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_blackman_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_cosine_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_cosine_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_exponential_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_exponential_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_gaussian_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_gaussian_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_general_cosine_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_general_cosine_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_general_hamming_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_general_hamming_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_hamming_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_hamming_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_hann_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_hann_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_kaiser_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_kaiser_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_nuttall_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_nuttall_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_mm_reduce_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_mm_reduce_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_mm_reduce_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_mm_reduce_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_sampled_addmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_sampled_addmm_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_sampled_addmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_sampled_addmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stft_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_lowrank_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_lowrank_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_lowrank_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_lowrank_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch__scaled_mm_cuda_float8_e4m3fn, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__flash_attention_forward_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triangular_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triangular_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triangular_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triangular_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_indices_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_indices_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_indices_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_indices_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_uint16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_uint32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_uint64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_complex_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_complex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_complex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_real_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_real_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_get_default_device_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_get_default_device_more_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_nn_module_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_set_default_device_cuda, test/test_utils.py::TestCppExtensionUtils::test_cc_compiler_is_ok, test/test_utils.py::TestCppExtensionUtils::test_cpp_compiler_is_ok, test/test_utils.py::TestTraceback::test_basic, test/test_utils.py::TestTraceback::test_captured_traceback, test/test_utils.py::TestTraceback::test_captured_traceback_format_all, test/test_utils.py::TestTraceback::test_captured_traceback_format_all_cached, test/test_utils.py::TestTraceback::test_format_traceback_short, test/test_utils.py::TestTryImport::test_import_existing, test/test_utils.py::TestTryImport::test_import_imported, test/test_utils.py::TestTryImport::test_import_missing, test/test_utils.py::TestDeprecate::test_deprecated 2025-10-10T02:26:24.3456223Z 2025-10-10T02:26:24.3456425Z Running inductor/test_triton_cpu_backend 1/1 ... [2025-10-10 02:26:24.042006] 2025-10-10T02:26:24.3456774Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:26:24.3457575Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_cpu_backend.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:26:24.042595] 2025-10-10T02:26:30.5561317Z 2025-10-10T02:26:30.5562826Z inductor/test_triton_cpu_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_cpu_backend_1.1_da4b0a120a76ede8_.log 2025-10-10T02:26:30.5564146Z 2025-10-10T02:26:30.5564529Z Running dynamo/test_torchrec 1/1 ... [2025-10-10 02:26:30.555725] 2025-10-10T02:26:30.5565276Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:26:30.5567711Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_torchrec.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:26:30.556144] 2025-10-10T02:26:33.6834054Z 2025-10-10T02:26:33.6835617Z dynamo/test_torchrec 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_torchrec_1.1_41fff7726b75459c_.log 2025-10-10T02:26:33.6836959Z Running 0 items in this shard: 2025-10-10T02:26:33.6837316Z 2025-10-10T02:26:33.6840117Z Running test_ops 7/9 ... [2025-10-10 02:26:33.683622] 2025-10-10T02:26:33.6840789Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:26:33.6847192Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=7', '--num-shards=9', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:26:33.684163] 2025-10-10T02:26:47.1836077Z 2025-10-10T02:26:47.1838045Z test_ops 7/9 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_7.9_bf5e9c3fc13f8b36_.log 2025-10-10T02:26:47.1839297Z Running 0 items in this shard: 2025-10-10T02:26:47.1839642Z 2025-10-10T02:26:47.1843441Z Running test_cuda_expandable_segments 1/1 ... 
[2025-10-10 02:26:47.183782] 2025-10-10T02:26:47.1844332Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:26:47.1850246Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cuda_expandable_segments.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:26:47.184365] 2025-10-10T02:26:51.5720252Z 2025-10-10T02:26:51.5721841Z test_cuda_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cuda_expandable_segments_1.1_5b47914673b4dfb7_.log 2025-10-10T02:26:51.5723111Z 2025-10-10T02:26:51.5726058Z Running test_decomp 2/17 ... [2025-10-10 02:26:51.572148] 2025-10-10T02:26:51.5726779Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:26:51.5733817Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=2', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:26:51.572730] 2025-10-10T02:26:57.6535535Z 2025-10-10T02:26:57.6536806Z test_decomp 2/17 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_2.17_07f5036cd8a59b92_.log 2025-10-10T02:26:57.6538075Z Running 0 items in this shard: 2025-10-10T02:26:57.6538420Z 2025-10-10T02:26:57.6542905Z Running test_decomp 3/17 ... [2025-10-10 02:26:57.653763] 2025-10-10T02:26:57.6543654Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:26:57.6549181Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=3', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:26:57.654325] 2025-10-10T02:27:03.6860936Z 2025-10-10T02:27:03.6862472Z test_decomp 3/17 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_3.17_a026ac633d0cf4e9_.log 2025-10-10T02:27:03.6863841Z Running 0 items in this shard: 2025-10-10T02:27:03.6864196Z 2025-10-10T02:27:03.6868174Z Running test_decomp 14/17 ... [2025-10-10 02:27:03.686302] 2025-10-10T02:27:03.6868952Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:03.6875362Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=14', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:27:03.686892] 2025-10-10T02:27:09.8179907Z 2025-10-10T02:27:09.8181939Z test_decomp 14/17 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_14.17_7316919f0ef7a17b_.log 2025-10-10T02:27:09.8183322Z Running 0 items in this shard: 2025-10-10T02:27:09.8183686Z 2025-10-10T02:27:09.8186697Z Running test_decomp 15/17 ... [2025-10-10 02:27:09.818173] 2025-10-10T02:27:09.8187376Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:09.8192577Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=15', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:27:09.818715] 2025-10-10T02:27:15.8997246Z 2025-10-10T02:27:15.8998784Z test_decomp 15/17 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_15.17_a2ab3d9511da100a_.log 2025-10-10T02:27:15.9000023Z Running 0 items in this shard: 2025-10-10T02:27:15.9000370Z 2025-10-10T02:27:15.9003143Z Running test_jit_fuser_te 2/2 ... 
[2025-10-10 02:27:15.899861] 2025-10-10T02:27:15.9004555Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:15.9010799Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jit_fuser_te.py', '-m', 'serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:27:15.900408] 2025-10-10T02:27:22.2321354Z 2025-10-10T02:27:22.2322662Z test_jit_fuser_te 2/2 was successful, full logs can be found in artifacts with path test/test-reports/test_jit_fuser_te_2.2_a6cbe23fe082fbe6_.log 2025-10-10T02:27:22.2324032Z Running 0 items in this shard: 2025-10-10T02:27:22.2329737Z 2025-10-10T02:27:22.2330267Z Running test_nestedtensor 1/3 ... [2025-10-10 02:27:22.232388] 2025-10-10T02:27:22.2331060Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:22.2336733Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '-m', 'serial', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:27:22.232954] 2025-10-10T02:27:27.2612437Z 2025-10-10T02:27:27.2613704Z test_nestedtensor 1/3 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_1.3_83452fa433d878e3_.log 2025-10-10T02:27:27.2615027Z Running 0 items in this shard: 2025-10-10T02:27:27.2616074Z 2025-10-10T02:27:27.2620035Z Running profiler/test_execution_trace 1/1 ... [2025-10-10 02:27:27.261540] 2025-10-10T02:27:27.2620949Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:27.2628055Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_execution_trace.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:27:27.262118] 2025-10-10T02:27:30.7871001Z 2025-10-10T02:27:30.7872764Z profiler/test_execution_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_execution_trace_1.1_1d023a968faf2430_.log 2025-10-10T02:27:30.7874471Z Running 0 items in this shard: 2025-10-10T02:27:30.7874822Z 2025-10-10T02:27:30.7876464Z Running profiler/test_record_function 1/1 ... [2025-10-10 02:27:30.787188] 2025-10-10T02:27:30.7877274Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:30.7884142Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_record_function.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:27:30.787778] 2025-10-10T02:27:33.8113570Z 2025-10-10T02:27:33.8115285Z profiler/test_record_function 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_record_function_1.1_7a31bca357067301_.log 2025-10-10T02:27:33.8116798Z Running 0 items in this shard: 2025-10-10T02:27:33.8117869Z 2025-10-10T02:27:33.8120339Z Running test_sparse_semi_structured 1/1 ... [2025-10-10 02:27:33.811610] 2025-10-10T02:27:33.8121124Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:33.8127461Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_semi_structured.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:27:33.812146] 2025-10-10T02:27:40.0435389Z 2025-10-10T02:27:40.0436996Z test_sparse_semi_structured 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_semi_structured_1.1_9dace1cc59672d62_.log 2025-10-10T02:27:40.0438695Z Running 0 items in this shard: 2025-10-10T02:27:40.0439082Z 2025-10-10T02:27:40.0441931Z Running functorch/test_aot_joint_with_descriptors 1/1 ... [2025-10-10 02:27:40.043719] 2025-10-10T02:27:40.0442879Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:40.0449289Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aot_joint_with_descriptors.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:27:40.044285] 2025-10-10T02:27:43.1678645Z 2025-10-10T02:27:43.1680486Z functorch/test_aot_joint_with_descriptors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aot_joint_with_descriptors_1.1_8b5b96b59d7b9c35_.log 2025-10-10T02:27:43.1682158Z Running 0 items in this shard: 2025-10-10T02:27:43.1682518Z 2025-10-10T02:27:43.1685809Z Running functorch/test_eager_transforms 1/1 ... [2025-10-10 02:27:43.168098] 2025-10-10T02:27:43.1686662Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:43.1692574Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_eager_transforms.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:27:43.168624] 2025-10-10T02:27:47.8960265Z 2025-10-10T02:27:47.8961929Z functorch/test_eager_transforms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_eager_transforms_1.1_6f9743462729e8a3_.log 2025-10-10T02:27:47.8964264Z Running 0 items in this shard: 2025-10-10T02:27:47.8964673Z 2025-10-10T02:27:47.8967081Z Running functorch/test_vmap 1/1 ... [2025-10-10 02:27:47.896226] 2025-10-10T02:27:47.8967874Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:47.8973739Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_vmap.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:27:47.896764] 2025-10-10T02:27:52.8750081Z 2025-10-10T02:27:52.8751545Z functorch/test_vmap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_vmap_1.1_fefd6c1dd7a7d423_.log 2025-10-10T02:27:52.8752894Z Running 0 items in this shard: 2025-10-10T02:27:52.8753256Z 2025-10-10T02:27:52.8756657Z Running functorch/test_control_flow 4/5 ... [2025-10-10 02:27:52.875194] 2025-10-10T02:27:52.8757467Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:52.8763961Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_control_flow.py', '-m', 'serial', '--shard-id=4', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:27:52.875754] 2025-10-10T02:27:57.1019403Z 2025-10-10T02:27:57.1020850Z functorch/test_control_flow 4/5 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_control_flow_4.5_ab10e701a49425de_.log 2025-10-10T02:27:57.1022329Z Running 0 items in this shard: 2025-10-10T02:27:57.1032999Z 2025-10-10T02:27:57.1033400Z Running test_ops_gradients 2/3 ... [2025-10-10 02:27:57.102141] 2025-10-10T02:27:57.1034607Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:27:57.1036438Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '-m', 'serial', '--shard-id=2', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:27:57.102680] 2025-10-10T02:28:02.7325029Z 2025-10-10T02:28:02.7326453Z test_ops_gradients 2/3 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_2.3_64851534da12b1c5_.log 2025-10-10T02:28:02.7327791Z Running 0 items in this shard: 2025-10-10T02:28:02.7328141Z 2025-10-10T02:28:02.7331687Z Running test_ops_jit 1/2 ... [2025-10-10 02:28:02.732750] 2025-10-10T02:28:02.7332372Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:02.7338750Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_jit.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:02.733287] 2025-10-10T02:28:07.3114681Z 2025-10-10T02:28:07.3116804Z test_ops_jit 1/2 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_jit_1.2_fd920b3db702ffc2_.log 2025-10-10T02:28:07.3118034Z Running 0 items in this shard: 2025-10-10T02:28:07.3118393Z 2025-10-10T02:28:07.3122670Z Running xpu/test_conv 1/1 ... 
[2025-10-10 02:28:07.311731] 2025-10-10T02:28:07.3123566Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:07.3128759Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'xpu/test_conv.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:07.312300] 2025-10-10T02:28:10.9862783Z 2025-10-10T02:28:10.9864382Z xpu/test_conv 1/1 was successful, full logs can be found in artifacts with path test/test-reports/xpu.test_conv_1.1_78871d8272c5db66_.log 2025-10-10T02:28:10.9865657Z Running 0 items in this shard: 2025-10-10T02:28:10.9866004Z 2025-10-10T02:28:14.3578096Z Running inductor/test_triton_cpu_backend 1/1 ... [2025-10-10 02:28:14.356707] 2025-10-10T02:28:14.3579241Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:14.3580471Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_cpu_backend.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:14.357314] 2025-10-10T02:28:14.5395772Z Running dynamo/test_torchrec 1/1 ... [2025-10-10 02:28:14.538332] 2025-10-10T02:28:14.5396674Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:14.5398726Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_torchrec.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:14.538997] 2025-10-10T02:28:14.5847928Z Running test_ops 7/9 ... 
[2025-10-10 02:28:14.584001] 2025-10-10T02:28:14.5848845Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:14.5851889Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=7', '--num-shards=9', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:14.584704] 2025-10-10T02:28:14.6169815Z Running test_cuda_expandable_segments 1/1 ... [2025-10-10 02:28:14.616406] 2025-10-10T02:28:14.6170760Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:14.6173890Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cuda_expandable_segments.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:14.616904] 2025-10-10T02:28:14.6256065Z Running test_decomp 2/17 ... [2025-10-10 02:28:14.624964] 2025-10-10T02:28:14.6256980Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:14.6261041Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=2', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:14.625425] 2025-10-10T02:28:14.6692922Z Running test_decomp 3/17 ... [2025-10-10 02:28:14.668734] 2025-10-10T02:28:14.6693717Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:14.6696619Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=3', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:14.669220] 2025-10-10T02:28:14.6729095Z Running test_decomp 14/17 ... 
[2025-10-10 02:28:14.672386] 2025-10-10T02:28:14.6729900Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:14.6734453Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=14', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:14.672909] 2025-10-10T02:28:14.7002483Z Running test_decomp 15/17 ... [2025-10-10 02:28:14.699580] 2025-10-10T02:28:14.7003295Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:14.7007368Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=15', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:14.700165] 2025-10-10T02:28:18.0204843Z 2025-10-10T02:28:18.0206436Z dynamo/test_torchrec 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_torchrec_1.1_c7ac29379933181b_.log 2025-10-10T02:28:18.0212058Z Running 0 items in this shard: 2025-10-10T02:28:18.0212395Z 2025-10-10T02:28:18.0225136Z Running test_jit_fuser_te 2/2 ... [2025-10-10 02:28:18.020650] 2025-10-10T02:28:18.0225691Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:18.0227322Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jit_fuser_te.py', '-m', 'not serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:18.021038] 2025-10-10T02:28:19.4873410Z 2025-10-10T02:28:19.4874783Z test_cuda_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cuda_expandable_segments_1.1_92284354314c6193_.log 2025-10-10T02:28:19.4875451Z 2025-10-10T02:28:19.4875636Z Running test_nestedtensor 1/3 ... 
[2025-10-10 02:28:19.486859] 2025-10-10T02:28:19.4876018Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:19.4882257Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:19.487529] 2025-10-10T02:28:21.4427831Z 2025-10-10T02:28:21.4429050Z inductor/test_triton_cpu_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_cpu_backend_1.1_16298a6f1221d464_.log 2025-10-10T02:28:21.4429846Z 2025-10-10T02:28:21.4430121Z Running profiler/test_execution_trace 1/1 ... [2025-10-10 02:28:21.442688] 2025-10-10T02:28:21.4430607Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:21.4435856Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_execution_trace.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:28:21.443211] 2025-10-10T02:28:31.1837613Z 2025-10-10T02:28:31.1839698Z profiler/test_execution_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_execution_trace_1.1_b3fba3529d0e4596_.log 2025-10-10T02:28:31.1852200Z Running 13 items in this shard: test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_alone_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_env_disabled_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_env_enabled_with_kineto_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_env_enabled_with_pt2_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_nested_tensor_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_no_capture_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_record_integral_tensor_data_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_record_integral_tensor_range_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_repeat_in_loop_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_start_stop_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_with_kineto_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_with_pt2_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_triton_fx_graph_with_et_cuda 2025-10-10T02:28:31.1859474Z 2025-10-10T02:28:31.1859915Z Running profiler/test_record_function 1/1 ... 
[2025-10-10 02:28:31.183733] 2025-10-10T02:28:31.1860732Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:31.1862895Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_record_function.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:31.184383] 2025-10-10T02:28:34.8109756Z 2025-10-10T02:28:34.8111397Z profiler/test_record_function 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_record_function_1.1_819be13532528fe4_.log 2025-10-10T02:28:34.8117050Z Running 6 items in this shard: test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_delegation_with_profiler, test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_with_record_function, test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_with_record_function_fork, test/profiler/test_record_function.py::TestRecordFunction::test_python_dispatch_mode_record_function, test/profiler/test_record_function.py::TestRecordFunction::test_python_subclass_record_function, test/profiler/test_record_function.py::TestRecordFunction::test_record_function 2025-10-10T02:28:34.8121248Z 2025-10-10T02:28:34.8121652Z Running test_sparse_semi_structured 1/1 ... [2025-10-10 02:28:34.810441] 2025-10-10T02:28:34.8122435Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:34.8124343Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_semi_structured.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:28:34.811087] 2025-10-10T02:28:41.7907361Z 2025-10-10T02:28:41.7908599Z test_sparse_semi_structured 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_semi_structured_1.1_43779d8d7439c023_.log 2025-10-10T02:28:41.7930753Z Running 42 items in this shard: test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_mlp_contiguous_relu_compile_cusparselt, test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_mlp_contiguous_relu_compile_cutlass, test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_sp24_compile, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_indices, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_linear, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_min_sparse_shape, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_mlp, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_mm_sparse_first_NN, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_mm_sparse_first_NT, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_mm_sparse_first_TN, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_mm_sparse_second_NN, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_mm_sparse_second_NT, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_to_sparse_semi_structured, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_unsupported_dim, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_unsupported_dtype, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_unsupported_shape, test/test_sparse_semi_structured.py::TestSparseSemiStructured::test_values, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_gemm, 
test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_pack_both_ways_edge_case1, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_pack_both_ways_id, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_pack_both_ways_meta_correctness, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_prune_dense_static_sort, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_pruning_algo_largest_abs_values_greedy, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_sp24_apply, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_sp24_apply_dense, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_sp24_matmuls, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_sp24_matmuls_bmm, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTraining::test_sp24_matmuls_mat_vec, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASS::test_conversions, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASS::test_conversions_all_patterns, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASS::test_linear_cutlass, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASS::test_sparse_semi_structured_ops_cutlass, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_cslt_sparse_mm_alpha, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_cslt_sparse_mm_alpha_compile_autotune, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_cslt_sparse_mm_alpha_mixed_dtype, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_cslt_sparse_mm_mixed_dtype, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_cslt_sparse_mm_search, 
test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_csrc_cslt_sparse_mm_search, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_cusparselt_backend, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_sparse_fp8fp8_mm, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_sparse_semi_structured_scaled_mm, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELT::test_sparse_semi_structured_scaled_mm_fp8 2025-10-10T02:28:41.7945037Z 2025-10-10T02:28:41.7945288Z Running functorch/test_aot_joint_with_descriptors 1/1 ... [2025-10-10 02:28:41.790906] 2025-10-10T02:28:41.7945722Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:41.7946721Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aot_joint_with_descriptors.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:28:41.791847] 2025-10-10T02:28:51.5966812Z 2025-10-10T02:28:51.5967999Z functorch/test_aot_joint_with_descriptors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aot_joint_with_descriptors_1.1_0b7f286d1dd6e286_.log 2025-10-10T02:28:51.5974492Z Running 11 items in this shard: test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_conv_bn_module, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_export_and_compile, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_fx_utils_conv_bn_module, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_fx_utils_multiple_outputs, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_fx_utils_node_consistency, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_fx_utils_simple_linear, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_in_out_specs, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_module_with_kwargs, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_multiple_outputs_module, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_preserve_annotate_simple, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_simple_linear_module 2025-10-10T02:28:51.5980818Z 2025-10-10T02:28:51.5981100Z Running functorch/test_eager_transforms 1/1 ... 
[2025-10-10 02:28:51.596646] 2025-10-10T02:28:51.5981603Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:28:51.5982944Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_eager_transforms.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:28:51.597293] 2025-10-10T02:29:13.1192922Z 2025-10-10T02:29:13.1193925Z functorch/test_eager_transforms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_eager_transforms_1.1_b251301dab30f000_.log 2025-10-10T02:29:13.1350103Z Running 355 items in this shard: test/functorch/test_eager_transforms.py::TestSliceArgnums::test_argnums_reorders, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_duplicate_argnums, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_negative_int_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_positive_int_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_tuple_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_invalid_argnum_type, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_not_enough_argnums, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_out_of_bounds_argnum_values, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_pytree_args, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_buffer_tying, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_combine_state_for_ensemble_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_combine_state_for_ensemble_smoke, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_correctness_mnist_mechanism_functional_call, 
test/functorch/test_eager_transforms.py::TestMakeFunctional::test_correctness_mnist_mechanism_make_functional, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_disable_autograd_tracking_disable_autograd_tracking_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_disable_autograd_tracking_disable_autograd_tracking_True, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_make_functional_state_correctly_returned_after_forward_mechanism_functional_call, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_make_functional_state_correctly_returned_after_forward_mechanism_make_functional, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying_ensemble, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying_grad, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_leaf, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_mismatch_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_smoke, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_using_detach_functional_call_detach_params_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_using_detach_functional_call_detach_params_True, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_with_buffers_disable_autograd_tracking_disable_autograd_tracking_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_with_buffers_disable_autograd_tracking_disable_autograd_tracking_True, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_advanced_indexing_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_argnums_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composed_with_autograd_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_complicated_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_two_ops_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_conj_bit_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_dtype_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_escaped_wrappers_are_ignored_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_escaped_wrappers_are_marked_as_dead_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_fn_with_kwargs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_functional_init_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_functional_init_with_buffers_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_of_vjp_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_of_vjp_of_grad_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_pytree_inputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_captures_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_view_base_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_view_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_invalid_argnums_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_is_cuda_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_manual_seed_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_negative_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_nesting_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_inside_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_mixed_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_nested_complicated_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_nested_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_fn_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_only_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_value_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_numel_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_out_of_order_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_primitive_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_print_captured_tensor_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_shape_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_ctor_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_grad_grad_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_vmap_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_hessian_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_vjp_multiple_inputs_outputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_view_inplace_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_views_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_of_grad_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_outputs_can_any_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_error_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_input_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_output_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_two_outputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_zero_grad_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_log_softmax_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_new_empty_materializes_tensor_cuda, 
test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_new_zeros_materializes_tensor_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_embeddingnet_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_embeddingnet_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_inplace_view_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_simple_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_correctness_different_devices_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_correctness_different_devices_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_default_arg_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_default_arg_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_multi_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_multi_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_unrelated_outputs_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_unrelated_outputs_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_zero_dim_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_zero_dim_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_defaults_to_zero_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_defaults_to_zero_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_effect_on_return_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_effect_on_return_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_tuple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_tuple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_tensor_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_tensor_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_chunksize_one__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_chunksize_one__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_composition__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_composition__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_complex_error_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_diff_numel_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_diff_numel_jacrev_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_dimensionality_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_dimensionality_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_float_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_float_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_hessian_simple_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_inplace_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_inplace_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_jac_with_non_tensor_args_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_jac_with_non_tensor_args_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_args_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_args_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_multidim_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_multidim_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_pytree_jacrev_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_multiple_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_multiple_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_single_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_single_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_negative_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_negative_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_nested_jac_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_nested_jac_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_out_of_bounds_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_out_of_bounds_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_outputs_can_any_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_outputs_can_any_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_repeated_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_repeated_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_not_flat_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_not_flat_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_take_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_take_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_input_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_input_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_vmap_on_jac_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_vmap_on_jac_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_autograd_function_disables_fwd_grad_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_inside_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_mixed_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_outside_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_inplace_on_captures_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_inputs_are_tuples_of_tensors_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_jvp_inside_autograd_function_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_jvp_new_tensor_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_inputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_inputs_outputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_outputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_nonempty_primals_and_tangents_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_outputs_can_any_pytree_cuda, 
test/functorch/test_eager_transforms.py::TestJvpCUDA::test_primals_tangents_length_mismatch_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_pytree_inputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_pytree_inputs_error_cases_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_simple_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_strict_mode_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_unrelated_input_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_unrelated_output_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_zerotensor_vmapjvp_interaction_cuda, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_basic_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_composition_grad_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_composition_vmap_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_errors_cuda, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_nested_input_nested_output_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_return_cuda_float32, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_base_inplace_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_base_view_inplace_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_no_view_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_right_dual_base_prop_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_right_dual_view_prop_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_multi_input_cuda, 
test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_simple_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_unrelated_outputs_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_jacfwd_different_levels_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_functionalize_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jacfwd_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jacrev_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jvp_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_vjp_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_functionalize_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_grad_when_key_is_excluded_cuda, 
test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_vmap_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_functionalize_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_jvp_supports_saved_tensor_hooks_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_no_warning_on_import_functorch_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_requires_grad_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_retain_grad_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_grad_and_value_cuda, 
test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_doesnt_support_saved_tensor_hooks_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_vmap_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_ensemble_regression_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_ensemble_regression_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_AlphaDropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_AlphaDropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_Dropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_Dropout_mechanism_make_functional_cuda, 
test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_FeatureAlphaDropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_FeatureAlphaDropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_lennard_jones_batched_jac_jac_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_lennard_jones_batched_jac_jac_jacrev_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_omniglot_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_omniglot_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_regression_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_regression_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_resnet18_per_sample_grads_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_resnet18_per_sample_grads_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_functional_call_originally_track_running_stats_False_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_functional_call_originally_track_running_stats_True_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_make_functional_originally_track_running_stats_False_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_make_functional_originally_track_running_stats_True_cuda, 
test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_basic_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_functional_call_multiple_dicts_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_name_wrapping_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_no_grad_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_no_grad_outside_grad_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_vmap_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_vmap_sum_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fake_tensors_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_multi_out_op_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_out_op_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_reapply_views_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_transpose_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_grad_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_nonfunctional_output_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_opt_tensor_list_cuda, 
test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_optional_tensorlist1_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_optional_tensorlist2_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_inplace_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_linear_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_multioutput_inplace_slice_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_multioutput_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_resize_program_inputs_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_simple_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_vmap_functionalize_jvp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_output_mark_dirty_True_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_output_mark_dirty_False_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_grad_fn_name_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_needs_input_grads_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_once_differentiable_autograd_vjp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_once_differentiable_grad_vjp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_set_materialize_grads_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_has_vmap_staticmethod_and_has_generate_vmap_rule_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_in_dims_multiple_inputs_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_in_dims_single_input_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_incompatible_out_dims_error_msg_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_info_object_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_kwarg_only_tensors_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_no_vmap_staticmethod_and_no_generate_vmap_rule_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_none_returns_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_should_have_two_returns_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_skips_empty_layer_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_error_if_name_collision_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_nesting_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_overrides_saved_tensors_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_passthrough_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_debug_unwrap_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_reductify_leaf_cuda, test/functorch/test_eager_transforms.py::TestCompileTransformsCUDA::test_compile_vmap_hessian_cuda, test/functorch/test_eager_transforms.py::TestCompileTransformsCUDA::test_grad_deprecated_api_cuda 2025-10-10T02:29:13.1504862Z 2025-10-10T02:29:13.1505076Z Running functorch/test_vmap 1/1 ... 
[2025-10-10 02:29:13.120064] 2025-10-10T02:29:13.1505517Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:29:13.1506799Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_vmap.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:29:13.120542] 2025-10-10T02:32:24.8912581Z 2025-10-10T02:32:24.8913769Z test_jit_fuser_te 2/2 was successful, full logs can be found in artifacts with path test/test-reports/test_jit_fuser_te_2.2_7be947a3c4e73698_.log 2025-10-10T02:32:24.9986091Z Running 3383 items in this shard: test/test_jit_fuser_te.py::TestTEFuserStatic::test_abs, test/test_jit_fuser_te.py::TestTEFuserStatic::test_add_bool, test/test_jit_fuser_te.py::TestTEFuserStatic::test_addcmul, test/test_jit_fuser_te.py::TestTEFuserStatic::test_arg_configurations_smoke, test/test_jit_fuser_te.py::TestTEFuserStatic::test_autocast_up, test/test_jit_fuser_te.py::TestTEFuserStatic::test_batch_norm, test/test_jit_fuser_te.py::TestTEFuserStatic::test_binary_pow, test/test_jit_fuser_te.py::TestTEFuserStatic::test_channels_last_dims_dynamic, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_distributes, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_motion_deduplicates_inputs, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_mul_one, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_multiple, test/test_jit_fuser_te.py::TestTEFuserStatic::test_clamp, test/test_jit_fuser_te.py::TestTEFuserStatic::test_clamp_double, test/test_jit_fuser_te.py::TestTEFuserStatic::test_clamp_int, test/test_jit_fuser_te.py::TestTEFuserStatic::test_concat, test/test_jit_fuser_te.py::TestTEFuserStatic::test_conv2d_depthwise, test/test_jit_fuser_te.py::TestTEFuserStatic::test_disabled, test/test_jit_fuser_te.py::TestTEFuserStatic::test_div_bool, 
test/test_jit_fuser_te.py::TestTEFuserStatic::test_dynamic_shapes, test/test_jit_fuser_te.py::TestTEFuserStatic::test_eq_unsqueeze_type_as, test/test_jit_fuser_te.py::TestTEFuserStatic::test_erf, test/test_jit_fuser_te.py::TestTEFuserStatic::test_exhaust_specializations, test/test_jit_fuser_te.py::TestTEFuserStatic::test_fusion_reuse_multi_gpu, test/test_jit_fuser_te.py::TestTEFuserStatic::test_hardsigmoid_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserStatic::test_hardswish_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserStatic::test_inlined_optimized_graph, test/test_jit_fuser_te.py::TestTEFuserStatic::test_kernel_cache_multi_gpu, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lerp, test/test_jit_fuser_te.py::TestTEFuserStatic::test_list_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lstm, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lstm_concat, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lstm_traced, test/test_jit_fuser_te.py::TestTEFuserStatic::test_matmul, test/test_jit_fuser_te.py::TestTEFuserStatic::test_minmax, test/test_jit_fuser_te.py::TestTEFuserStatic::test_minmax_int_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_neg_pow, test/test_jit_fuser_te.py::TestTEFuserStatic::test_pow_multiple_dtype, test/test_jit_fuser_te.py::TestTEFuserStatic::test_profiler, test/test_jit_fuser_te.py::TestTEFuserStatic::test_rand_broadcast_cuda, test/test_jit_fuser_te.py::TestTEFuserStatic::test_skip_grad_in_check, test/test_jit_fuser_te.py::TestTEFuserStatic::test_sum_dim, test/test_jit_fuser_te.py::TestTEFuserStatic::test_superslomo, test/test_jit_fuser_te.py::TestTEFuserStatic::test_ternary_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_to_device, test/test_jit_fuser_te.py::TestTEFuserStatic::test_to_dtype, test/test_jit_fuser_te.py::TestTEFuserStatic::test_unary_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_unrolled_cat, test/test_jit_fuser_te.py::TestTEFuserStatic::test_where_and_typing, 
test/test_jit_fuser_te.py::TestTEFuserDynamic::test_adaptive_avg_pool2d, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_arg_configurations_smoke, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_autocast_up, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_binary_scalar_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_binary_tensor_scalar_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_cat_graph_opt, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_correctness, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_distributes, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_mul_one, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_multiple, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_comparison_eq_ne, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_comparison_gt_lt, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_concat, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_concat_invariant, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_conv2d_depthwise, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_cuda_half, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_disabled, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_div_bool, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_dynamic_shapes, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_exhaust_specializations, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_exp, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_fusion_reuse_multi_gpu, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_hardsigmoid_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_inlined_optimized_graph, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_kernel_cache_multi_gpu, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_lerp, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_minmax, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_mul_bool, 
test/test_jit_fuser_te.py::TestTEFuserDynamic::test_neg_pow, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_pow_multiple_dtype, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_relu, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_scalar_arg, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_scalar_only_inputs, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_sum_dim, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_ternary_norm_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_threshold, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_to_dtype, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_torch_to, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_type_as_cat, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_unary_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_unrolled_cat, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_unsqueeze_size_calculation, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_where_and_typing, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_with_strict_fusion, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_failures_matmul_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_complex32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rand___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rand___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__batch_norm_with_update_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__batch_norm_with_update_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__native_batch_norm_legit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_lengths_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_lengths_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_offsets_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_offsets_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__softmax_backward_data_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmv_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_allclose_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_allclose_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bernoulli_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bernoulli_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bincount_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_left_shift_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_left_shift_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_left_shift_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_not_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_not_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_or_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_or_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_right_shift_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_right_shift_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bmm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bmm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_shapes_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_complex64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cauchy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cauchy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_complex64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_inverse_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_inverse_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_inverse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_inverse_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_complex128, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_int32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_int8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float8_e5m2, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float8_e5m2fnuz, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_complex32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_complex128, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_frexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_uint32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geqrf_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geqrf_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geqrf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_complex64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hypot_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hypot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hypot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_igamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_igamma_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_imag_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_imag_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_complex32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_inner_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_inner_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lcm_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lcm_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_ex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cond_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cond_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_det_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_det_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eig_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvals_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvals_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvalsh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvalsh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_householder_product_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_householder_product_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_cuda_complex64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_ex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_grad_oriented_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_power_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_power_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_hermitian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_hermitian_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_hermitian_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_singular_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_qr_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_slogdet_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_ex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svd_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svdvals_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorinv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorsolve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorsolve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorsolve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_int32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_normal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp2_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logcumsumexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logcumsumexp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logdet_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_solve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_unpack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_unpack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_int8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_log_softmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logaddexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logaddexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_median_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_pool2d_with_indices_backward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_pool2d_with_indices_backward_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_int32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_multinomial_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanquantile_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanquantile_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_batch_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_batch_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_layer_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_layer_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_complex32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_int32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nextafter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nextafter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nextafter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool3d_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_alpha_dropout_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_alpha_dropout_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_bilinear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_bilinear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_celu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_similarity_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cross_entropy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cross_entropy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_ctc_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_ctc_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_elu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gaussian_nll_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gelu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gelu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gelu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_glu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_grid_sample_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_grid_sample_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_group_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_group_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardshrink_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardsigmoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardswish_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardswish_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hinge_embedding_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hinge_embedding_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hinge_embedding_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_huber_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_instance_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_instance_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_instance_norm_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_area_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bicubic_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bicubic_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bilinear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_linear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_linear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_linear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_trilinear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_trilinear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_kl_div_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_kl_div_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_layer_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_layer_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_layer_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_leaky_relu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_logsigmoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_logsigmoid_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_int32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_grad_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_grad_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_grad_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_grad_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_grad_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mish_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mse_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mse_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mse_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_head_attention_forward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_head_attention_forward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_head_attention_forward_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_margin_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_margin_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_margin_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_nll_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_nll_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_int8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pdist_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_prelu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rms_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rms_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rrelu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rrelu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_selu_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_selu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_complex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_smooth_l1_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_soft_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_soft_margin_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_int8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softshrink_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_complex128, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_bilinear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_bilinear_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_nearest_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_fro_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_inf_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_inf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_nuc_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_nuc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_nuc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_in_place_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_in_place_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_number_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_number_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ormqr_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pca_lowrank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pca_lowrank_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_int8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pinverse_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pinverse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pinverse_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polar_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polar_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_qr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_qr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_int8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_0_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_3_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_neg_3_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_neg_3_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_neg_3_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_int8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_bartlett_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_bartlett_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_blackman_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_cosine_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_exponential_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_general_hamming_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_general_hamming_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_hamming_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_hamming_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_hann_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_int32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_mm_reduce_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_mm_reduce_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_sampled_addmm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_sampled_addmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_sampled_addmm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_complex64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_complex32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_complex128, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_lowrank_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_lowrank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_float64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_complex128, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_int8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensordot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensordot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_int32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_int64, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triangular_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_complex32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_indices_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_uint8, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unravel_index_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unravel_index_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unravel_index_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_bool, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_cuda_bfloat16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_unbiased_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_complex_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_real_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_real_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_int16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_float16, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported___getitem___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__native_batch_norm_legit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__segment_reduce_offsets_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__softmax_backward_data_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_acosh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_alias_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_all_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_allclose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_aminmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_angle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_any_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_argmin_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_argsort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_asinh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_atleast_3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_bernoulli_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_block_diag_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_bmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_broadcast_shapes_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cartesian_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cdouble_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_chalf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cholesky_inverse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cholesky_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_clone_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_column_stack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_complex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_conj_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_constant_pad_nd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cross_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cummax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cumulative_trapezoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_deg2rad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diag_embed_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diagonal_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diagonal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_digamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_dsplit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_dstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_einsum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_empty_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_empty_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_erfinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_expand_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_eye_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_fft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_fftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_hfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_hfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ihfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ihfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_rfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_flip_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_flipud_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_frexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_full_like_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_gradient_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_grid_sampler_2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_grid_sampler_3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_hstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_i0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_igamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_reduce_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_reduce_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isclose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isreal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_jiterator_4inputs_with_extra_args_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_jiterator_unary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ldexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_cholesky_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_cond_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_diagonal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_eig_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_eigh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_inv_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_ldl_factor_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_lstsq_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_lu_factor_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_matrix_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_matrix_power_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_matrix_rank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_matrix_rank_hermitian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_multi_dot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_norm_subgradients_at_zero_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_pinv_singular_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_svdvals_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_tensorsolve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linspace_tensor_overload_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_log_normal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_log_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_log_softmax_with_dtype_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logaddexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logical_and_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logspace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logspace_tensor_overload_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_lu_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mH_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mT_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_argmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_log_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_logaddexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_logsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_softmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_std_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_sum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_var_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_median_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_meshgrid_variadic_tensors_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_min_reduction_with_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_minimum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_movedim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nan_to_num_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nanmean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nanmedian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_native_batch_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_native_layer_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_empty_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_empty_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_full_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_ones_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nextafter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_alpha_dropout_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_avg_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_avg_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_batch_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_bilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_conv2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_conv3d_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_cross_entropy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_ctc_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_dropout3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_dropout_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_embedding_bag_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_embedding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_fractional_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_glu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_hinge_embedding_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_area_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_bilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_trilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_kl_div_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_layer_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_margin_ranking_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_pool3d_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_unpool2d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_mse_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_multi_head_attention_forward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_normalize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pixel_shuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pixel_unshuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_selu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_silu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_soft_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_softmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_softmin_with_dtype_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_softshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_triplet_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_upsample_nearest_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_norm_fro_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_normal_in_place_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_normal_number_mean_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ones_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ormqr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_permute_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_pinverse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polar_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_4_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_positive_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_qr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_rand_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_randint_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_randn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_randn_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ravel_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_renorm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_repeat_interleave_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_resize_as__cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_resolve_conj_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_resolve_neg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_roll_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_round_decimals_0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_round_decimals_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scalar_tensor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_add_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_reduce_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_blackman_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_cosine_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_gaussian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_hamming_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_hann_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signbit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sinc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_slice_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_slice_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sparse_sampled_addmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_airy_ai_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_bessel_j1_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_chebyshev_polynomial_t_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_chebyshev_polynomial_u_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_chebyshev_polynomial_v_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_entr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_erfcx_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_hermite_polynomial_h_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_hermite_polynomial_he_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_log_ndtr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_modified_bessel_i1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_spherical_bessel_j0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_zeta_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_square_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_squeeze_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_squeeze_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_squeeze_multiple_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_std_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_std_mean_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_std_mean_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_stft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sum_to_size_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_svd_lowrank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_tensor_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_tensordot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_tile_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_to_sparse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_trapz_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_tril_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unbind_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unbind_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unique_consecutive_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unsafe_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unsqueeze_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_vdot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_xlogy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working___rdiv___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working___rmul___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_abs_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_acos_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_addcmul_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_addmm_decomposed_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_asin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_atan2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_atan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_bool_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_ceil_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_cos_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_div_floor_rounding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_div_no_rounding_mode_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_div_trunc_rounding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_double_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_erfc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_exp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_expand_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_expm1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_floor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_fmod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_ge_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_isnan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_le_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_lgamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_log_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_long_cuda_float32, 
test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_masked_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_mm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_ne_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_hardswish_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_relu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_softplus_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_tanhshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_threshold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_reciprocal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_remainder_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_reshape_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_reshape_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_round_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_rsub_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sigmoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sign_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sinh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_tanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_transpose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_trunc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_where_cuda_float32 2025-10-10T02:32:25.0802956Z 2025-10-10T02:32:25.0803155Z Running 
functorch/test_control_flow 4/5 ... [2025-10-10 02:32:24.899253] 2025-10-10T02:32:25.0803493Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:32:25.0804293Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_control_flow.py', '-m', 'not serial', '--shard-id=4', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:32:24.899962] 2025-10-10T02:34:58.0459481Z 2025-10-10T02:34:58.0465089Z functorch/test_vmap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_vmap_1.1_c619f8ccbd658e58_.log 2025-10-10T02:34:58.1540940Z Running 2136 items in this shard: test/functorch/test_vmap.py::TestVmapAPI::test_accepts_nested_inputs, test/functorch/test_vmap.py::TestVmapAPI::test_backward_unsupported_interaction, test/functorch/test_vmap.py::TestVmapAPI::test_batch_rule_does_not_need_to_handle_no_batched_input, test/functorch/test_vmap.py::TestVmapAPI::test_batched_gradient_basic, test/functorch/test_vmap.py::TestVmapAPI::test_checkpoint, test/functorch/test_vmap.py::TestVmapAPI::test_constant_function, test/functorch/test_vmap.py::TestVmapAPI::test_data_attribute, test/functorch/test_vmap.py::TestVmapAPI::test_data_dependent_control_flow_throws, test/functorch/test_vmap.py::TestVmapAPI::test_decomposition_under_python_dispatcher, test/functorch/test_vmap.py::TestVmapAPI::test_different_map_dim_size_raises, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_does_not_warn_by_default, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_masked_fill, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_multiple_returns, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_warning, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_warns_when_warnings_are_enabled, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_with_undefined_grad, 
test/functorch/test_vmap.py::TestVmapAPI::test_fallback_zero_dim, test/functorch/test_vmap.py::TestVmapAPI::test_func_with_no_inputs, test/functorch/test_vmap.py::TestVmapAPI::test_func_with_no_tensors, test/functorch/test_vmap.py::TestVmapAPI::test_functools_partial, test/functorch/test_vmap.py::TestVmapAPI::test_grad_unsupported_interaction, test/functorch/test_vmap.py::TestVmapAPI::test_in_dim_not_in_tensor_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_in_dims_wrong_type_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_inplace_fallback_nary_different_levels, test/functorch/test_vmap.py::TestVmapAPI::test_inplace_fallback_nary_same_levels, test/functorch/test_vmap.py::TestVmapAPI::test_inplace_fallback_unary, test/functorch/test_vmap.py::TestVmapAPI::test_integer_in_dim_but_not_tensor_input_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_item_throws, test/functorch/test_vmap.py::TestVmapAPI::test_multiple_inputs, test/functorch/test_vmap.py::TestVmapAPI::test_multiple_out_dims, test/functorch/test_vmap.py::TestVmapAPI::test_multiple_outputs, test/functorch/test_vmap.py::TestVmapAPI::test_multiple_outputs2, test/functorch/test_vmap.py::TestVmapAPI::test_nested_negative_in_dims, test/functorch/test_vmap.py::TestVmapAPI::test_nested_non_default_in_dims, test/functorch/test_vmap.py::TestVmapAPI::test_nested_out_dims, test/functorch/test_vmap.py::TestVmapAPI::test_nested_with_diag_embed, test/functorch/test_vmap.py::TestVmapAPI::test_nested_with_different_map_dim, test/functorch/test_vmap.py::TestVmapAPI::test_nested_with_same_map_dim, test/functorch/test_vmap.py::TestVmapAPI::test_nn_module, test/functorch/test_vmap.py::TestVmapAPI::test_non_default_in_dims_out_dims, test/functorch/test_vmap.py::TestVmapAPI::test_non_tensor_output_raises, test/functorch/test_vmap.py::TestVmapAPI::test_non_zero_in_dims, test/functorch/test_vmap.py::TestVmapAPI::test_none_in_dims, test/functorch/test_vmap.py::TestVmapAPI::test_nonzero_out_dims, 
test/functorch/test_vmap.py::TestVmapAPI::test_noop_in_inner_vmap, test/functorch/test_vmap.py::TestVmapAPI::test_not_enough_in_dims_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_out_dim_out_of_bounds_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_and_num_outputs_mismatch_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_edge_case, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_must_be_int_or_collection_of_int_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_none, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_none_tuple, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_normal_tensor, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_odict_returns, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_returns, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_returns_broadcast_nested, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_returns_broadcast_simple, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_returns_outdims, test/functorch/test_vmap.py::TestVmapAPI::test_reshape_dim_into, test/functorch/test_vmap.py::TestVmapAPI::test_reshape_dim_outof, test/functorch/test_vmap.py::TestVmapAPI::test_restore_vmap_no_vmapped_inputs, test/functorch/test_vmap.py::TestVmapAPI::test_restore_vmap_pytree_input_output, test/functorch/test_vmap.py::TestVmapAPI::test_restore_vmap_unexpanded_outputs, test/functorch/test_vmap.py::TestVmapAPI::test_single_input, test/functorch/test_vmap.py::TestVmapAPI::test_unsupported_op_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_vmap_autocast_cpu, test/functorch/test_vmap.py::TestVmapAPI::test_vmap_autocast_cuda, test/functorch/test_vmap.py::TestVmapOperators::test_T_numpy, test/functorch/test_vmap.py::TestVmapOperators::test_adaptive_avg_pool2d, test/functorch/test_vmap.py::TestVmapOperators::test_argmax_dim, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_add, 
test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_add_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_div, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_div_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_mul, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_mul_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_pow, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_pow_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_sub, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_sub_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_as_strided, test/functorch/test_vmap.py::TestVmapOperators::test_bmm, test/functorch/test_vmap.py::TestVmapOperators::test_cat, test/functorch/test_vmap.py::TestVmapOperators::test_chunk, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_1_randomness_same, 
test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_clamp, test/functorch/test_vmap.py::TestVmapOperators::test_clamp_inplace_variant_clamp_max_, test/functorch/test_vmap.py::TestVmapOperators::test_clamp_inplace_variant_clamp_min_, test/functorch/test_vmap.py::TestVmapOperators::test_clamp_variant_clamp_max, test/functorch/test_vmap.py::TestVmapOperators::test_clamp_variant_clamp_min, test/functorch/test_vmap.py::TestVmapOperators::test_clone, test/functorch/test_vmap.py::TestVmapOperators::test_comparison_ops, test/functorch/test_vmap.py::TestVmapOperators::test_conj, test/functorch/test_vmap.py::TestVmapOperators::test_conj_bit, test/functorch/test_vmap.py::TestVmapOperators::test_contiguous, test/functorch/test_vmap.py::TestVmapOperators::test_conv2d, test/functorch/test_vmap.py::TestVmapOperators::test_copy_, test/functorch/test_vmap.py::TestVmapOperators::test_cross_batch_size_three, test/functorch/test_vmap.py::TestVmapOperators::test_diagonal, test/functorch/test_vmap.py::TestVmapOperators::test_dot, test/functorch/test_vmap.py::TestVmapOperators::test_expand_as, test/functorch/test_vmap.py::TestVmapOperators::test_fill_and_zero_inplace, 
test/functorch/test_vmap.py::TestVmapOperators::test_imag, test/functorch/test_vmap.py::TestVmapOperators::test_is_complex, test/functorch/test_vmap.py::TestVmapOperators::test_is_contiguous, test/functorch/test_vmap.py::TestVmapOperators::test_is_floating_point, test/functorch/test_vmap.py::TestVmapOperators::test_mean, test/functorch/test_vmap.py::TestVmapOperators::test_mean_dim, test/functorch/test_vmap.py::TestVmapOperators::test_mm, test/functorch/test_vmap.py::TestVmapOperators::test_mode_key, test/functorch/test_vmap.py::TestVmapOperators::test_movedim, test/functorch/test_vmap.py::TestVmapOperators::test_mv, test/functorch/test_vmap.py::TestVmapOperators::test_narrow, test/functorch/test_vmap.py::TestVmapOperators::test_new_empty, test/functorch/test_vmap.py::TestVmapOperators::test_new_empty_strided, test/functorch/test_vmap.py::TestVmapOperators::test_new_zeros, test/functorch/test_vmap.py::TestVmapOperators::test_nll_loss, test/functorch/test_vmap.py::TestVmapOperators::test_one_hot, test/functorch/test_vmap.py::TestVmapOperators::test_real, test/functorch/test_vmap.py::TestVmapOperators::test_repeat, test/functorch/test_vmap.py::TestVmapOperators::test_reshape, test/functorch/test_vmap.py::TestVmapOperators::test_reshape_as, test/functorch/test_vmap.py::TestVmapOperators::test_result_type, test/functorch/test_vmap.py::TestVmapOperators::test_roll_no_dims, test/functorch/test_vmap.py::TestVmapOperators::test_select, test/functorch/test_vmap.py::TestVmapOperators::test_silu_backward, test/functorch/test_vmap.py::TestVmapOperators::test_slice, test/functorch/test_vmap.py::TestVmapOperators::test_slogdet, test/functorch/test_vmap.py::TestVmapOperators::test_split, test/functorch/test_vmap.py::TestVmapOperators::test_squeeze, test/functorch/test_vmap.py::TestVmapOperators::test_stack, test/functorch/test_vmap.py::TestVmapOperators::test_stride, test/functorch/test_vmap.py::TestVmapOperators::test_sum, 
test/functorch/test_vmap.py::TestVmapOperators::test_sum_dim, test/functorch/test_vmap.py::TestVmapOperators::test_t, test/functorch/test_vmap.py::TestVmapOperators::test_tensor_split, test/functorch/test_vmap.py::TestVmapOperators::test_to, test/functorch/test_vmap.py::TestVmapOperators::test_trace, test/functorch/test_vmap.py::TestVmapOperators::test_transpose, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_abs, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_acos, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_asin, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_atan, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_ceil, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_cos, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_cosh, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_digamma, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_exp, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_expm1, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_floor, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_frac, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_lgamma, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_log, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_log10, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_log1p, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_log2, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_neg, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_reciprocal, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_relu, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_round, 
test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_rsqrt, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sigmoid, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sign, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sin, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sinh, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sqrt, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_tan, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_tanh, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_trunc, test/functorch/test_vmap.py::TestVmapOperators::test_unbind, test/functorch/test_vmap.py::TestVmapOperators::test_unfold, test/functorch/test_vmap.py::TestVmapOperators::test_unsafe_view, test/functorch/test_vmap.py::TestVmapOperators::test_unsqueeze, test/functorch/test_vmap.py::TestVmapOperators::test_view, test/functorch/test_vmap.py::TestVmapOperators::test_view_as, test/functorch/test_vmap.py::TestVmapOperators::test_view_as_complex, test/functorch/test_vmap.py::TestVmapOperators::test_view_as_real, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_0_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_0_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_0_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_0_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_1_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_1_out_dim_0_randomness_same, 
test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_1_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_1_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_0_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_0_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_0_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_0_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_1_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_1_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_1_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_1_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_0_randomness_same, 
test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_fallback_check, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_fallback_check_ok, test/functorch/test_vmap.py::TestVmapOperators::test_weird_matmul_case, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_0d_tensor_index_put_inplace_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_0d_tensor_index_put_inplace_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_advanced_indexing_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_False_track_running_stats_False_affine_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_False_track_running_stats_False_affine_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_False_track_running_stats_True_affine_False_cuda, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_False_track_running_stats_True_affine_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_True_track_running_stats_False_affine_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_True_track_running_stats_False_affine_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_True_track_running_stats_True_affine_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_True_track_running_stats_True_affine_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_conv_double_backward_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_fill__Tensor_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_flatten_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_foo_like_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_group_norm_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_index_fill_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_index_put_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_inplace_on_view_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_isinf_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_isnan_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_linalg_eigh_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_linalg_svd_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_namedtuple_returns_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_nested_advanced_indexing_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_CubeGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_H_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyCatCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyCubeCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyMulCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyMulScalarCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyNMSCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyNonzeroCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpySortAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpySortCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpySplitCopyCustomOp_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpySplitCopyWithIntCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyTakeCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyViewCopyCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_SelectAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_T_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___getitem___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___getitem___functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___radd___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rand___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rdiv___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rmatmul___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rmod___cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rmul___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___ror___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rpow___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rsub___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rxor___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__batch_norm_with_update_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__chunk_cat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__native_batch_norm_legit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__segment_reduce_lengths_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__segment_reduce_offsets_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__softmax_backward_data_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__unsafe_masked_index_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_abs_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_acos_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_acosh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_add_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addbmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addcdiv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addcmul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addmm_decomposed_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addmv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_alias_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_all_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_allclose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_aminmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_angle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_any_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_arange_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_argmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_argmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_argsort_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_argwhere_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_as_strided_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_as_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_as_strided_partial_views_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_as_strided_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_asin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_asinh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atan2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atleast_1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atleast_2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atleast_3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_baddbmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bernoulli_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bfloat16_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bincount_cuda_int64, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_and_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_left_shift_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_not_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_or_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_right_shift_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_xor_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_block_diag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bool_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_broadcast_shapes_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_broadcast_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_broadcast_to_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bucketize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_byte_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cartesian_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cat_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cauchy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cdist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cdouble_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ceil_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cfloat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_chalf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_char_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_char_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cholesky_inverse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cholesky_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_chunk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_clamp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_clamp_max_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_clamp_min_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_clone_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_column_stack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_combinations_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_complex_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_conj_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_conj_physical_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_constant_pad_nd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_contiguous_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_copysign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_corrcoef_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cos_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cosh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_count_nonzero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cov_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cummax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cummin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cumprod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cumsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cumulative_trapezoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_deg2rad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diag_embed_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diagflat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diagonal_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diagonal_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diff_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_digamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_dist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_div_floor_rounding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_div_no_rounding_mode_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_div_trunc_rounding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_double_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_double_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_dsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_dstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_einsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_empty_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_empty_like_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_empty_permuted_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_empty_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_eq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_equal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_erf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_erfc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_erfinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_exp2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_exp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_expand_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_expand_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_expand_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_expm1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_exponential_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_eye_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_fft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_fft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_fftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_fftshift_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_hfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_hfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_hfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ifft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ifft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ifftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ifftshift_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ihfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ihfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ihfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_irfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_irfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_irfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_rfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_rfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_rfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_flatten_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_flip_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fliplr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_flipud_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_float_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_float_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_float_power_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_floor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_floor_divide_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fmod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_frac_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_frexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_full_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_full_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_gather_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_gcd_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ge_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_geometric_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_geqrf_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_gradient_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_grid_sampler_2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_grid_sampler_3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_gt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_half_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_half_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_hash_tensor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_heaviside_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_histc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_hsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_hstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_hypot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_i0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_igamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_igammac_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_imag_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_fill_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_put_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_put_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_reduce_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_reduce_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_reduce_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_reduce_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_inner_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_int_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_int_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isclose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isfinite_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isinf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isnan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isneginf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isposinf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isreal_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_istft_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_item_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_unary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_kron_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_kthvalue_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lcm_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ldexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_le_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lerp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lgamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_cholesky_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_cond_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_cross_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_det_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_eig_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_eigh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_eigvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_eigvalsh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_householder_product_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_inv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_inv_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_ldl_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_ldl_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lstsq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lu_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lu_factor_ex_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_matrix_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_matrix_power_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_matrix_rank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_multi_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_pinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_pinv_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_pinv_singular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_slogdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_solve_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_solve_triangular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_svd_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_svdvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_tensorinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_tensorsolve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_vander_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_vecdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_vector_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linspace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linspace_tensor_overload_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log10_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log1p_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log_normal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log_softmax_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logaddexp2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logaddexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logcumsumexp_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logical_and_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logical_not_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logical_or_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logical_xor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logspace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logspace_tensor_overload_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_long_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_long_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lu_unpack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mH_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mT_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_amax_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_argmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_argmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_cumprod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_cumsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_log_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_logaddexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_logsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_median_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_normalize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_softmax_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_softmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_std_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_var_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_matmul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_matrix_exp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_max_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_max_reduction_no_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_max_reduction_with_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_maximum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_median_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_min_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_min_reduction_no_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_min_reduction_with_dim_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_minimum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mode_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_movedim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_msort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_multinomial_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nan_to_num_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nanmean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nanmedian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nanquantile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nansum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_narrow_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_narrow_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_native_batch_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_native_dropout_backward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_native_layer_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ne_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_neg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_empty_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_empty_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_full_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_ones_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_zeros_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nextafter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_max_pool3d_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_batch_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_celu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_cross_entropy_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_ctc_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_dropout2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_dropout3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_dropout_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_elu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_embedding_bag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_embedding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_gelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_glu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_grid_sample_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_group_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hardshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hardswish_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hardtanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_huber_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_instance_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_area_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_kl_div_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_l1_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_layer_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_leaky_relu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_linear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_local_response_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_logsigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool3d_grad_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_mish_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_mse_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_normalize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_one_hot_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_circular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_constant_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_reflect_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_replicate_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pairwise_distance_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pdist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_prelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_relu6_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_relu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_rms_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_rrelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_selu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_silu_complex_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_silu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softmin_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softplus_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softsign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_tanhshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_threshold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_unfold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nonzero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nonzero_static_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_norm_fro_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_norm_inf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_norm_nuc_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_normal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_normal_in_place_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_normal_number_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ones_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ones_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ormqr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_outer_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_pca_lowrank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_permute_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_permute_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_pinverse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polar_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_2_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_positive_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_pow_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_put_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_quantile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rad2deg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rand_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_randint_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_randint_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_randn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_randn_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ravel_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_real_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_reciprocal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_remainder_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_renorm_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_repeat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_repeat_interleave_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_reshape_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_reshape_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_resize__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_resize_as__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_resolve_conj_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_resolve_neg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_roll_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rot90_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_round_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_round_decimals_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_round_decimals_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_round_decimals_neg_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rsqrt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rsub_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scalar_tensor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_add_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_searchsorted_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_select_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sgn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_short_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_short_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_bartlett_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_blackman_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_cosine_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_exponential_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_gaussian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_general_cosine_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_general_hamming_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_hamming_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_hann_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_kaiser_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_nuttall_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signbit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sinc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sinh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_slice_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_slice_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_softmax_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sort_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sparse_mm_reduce_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sparse_sampled_addmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_airy_ai_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_bessel_j0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_bessel_j1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_bessel_y0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_bessel_y1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_entr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_erfcx_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_hermite_polynomial_h_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_hermite_polynomial_he_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_i0e_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_i1_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_i1e_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_legendre_polynomial_p_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_log_ndtr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_modified_bessel_i0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_modified_bessel_i1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_modified_bessel_k0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_modified_bessel_k1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_ndtr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_ndtri_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_shifted_chebyshev_polynomial_v_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_spherical_bessel_j0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_xlog1py_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_zeta_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_split_list_args_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_split_with_sizes_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_split_with_sizes_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sqrt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_square_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_squeeze_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_squeeze_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_squeeze_multiple_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_stack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_std_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_std_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_std_mean_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_std_unbiased_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_stft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sub_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sum_to_size_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_svd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_svd_lowrank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_t_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_take_along_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_take_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tensor_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tensordot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_to_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_to_sparse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_topk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_torch__scaled_mm_cuda_float8_e4m3fn, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_torch_ops_aten__flash_attention_forward_cuda_float16, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_trace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_transpose_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_transpose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_trapezoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_trapz_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_triangular_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tril_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tril_indices_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_triu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_triu_indices_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_true_divide_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_trunc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unbind_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unbind_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unflatten_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unfold_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unfold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_uniform_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unique_consecutive_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unique_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unravel_index_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unsafe_chunk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unsafe_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unsqueeze_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unsqueeze_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_var_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_var_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_var_mean_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_var_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_vdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_as_complex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_as_real_cuda_complex64, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_vsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_vstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_where_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_xlogy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_zero__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_zeros_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_zeros_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_searchsorted_bucketize_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_slogdet_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_sum_scalar_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_torch_return_types_returns_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_escaped_error_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_H_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyCatCustomOp_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyCubeCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyMulCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyMulScalarCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyNMSCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyNonzeroCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpySortAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpySortCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpySplitCopyCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpySplitCopyWithIntCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyTakeCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyViewCopyCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ScaleGradGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_SelectAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_T_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___getitem___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___getitem___functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___radd___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rand___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rdiv___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rmatmul___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rmod___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rmul___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___ror___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rpow___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rsub___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rxor___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__batch_norm_with_update_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__chunk_cat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__native_batch_norm_legit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__segment_reduce_lengths_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__segment_reduce_offsets_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__softmax_backward_data_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__unsafe_masked_index_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_abs_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_acos_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_acosh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addbmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addcdiv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addcmul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addmm_decomposed_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addmv_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_alias_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_all_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_allclose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_aminmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_angle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_any_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_arange_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_argmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_argmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_argsort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_argwhere_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_as_strided_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_as_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_as_strided_partial_views_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_as_strided_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_asin_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_asinh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atan2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atleast_1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atleast_2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atleast_3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_baddbmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bernoulli_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bfloat16_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bincount_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_and_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_left_shift_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_not_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_or_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_right_shift_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_xor_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_block_diag_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bool_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_broadcast_shapes_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_broadcast_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_broadcast_to_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bucketize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_byte_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cartesian_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cauchy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cdist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cdouble_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ceil_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cfloat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_chalf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_char_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_char_functorch_no_channels_last_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cholesky_inverse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cholesky_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_chunk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_clamp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_clamp_max_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_clamp_min_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_clone_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_column_stack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_combinations_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_complex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_conj_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_conj_physical_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_constant_pad_nd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_contiguous_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_copysign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_corrcoef_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cos_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cosh_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_count_nonzero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cov_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cummax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cummin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cumprod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cumsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cumulative_trapezoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_deg2rad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diag_embed_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diagflat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diagonal_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diagonal_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diff_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_digamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_dist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_div_floor_rounding_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_div_no_rounding_mode_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_div_trunc_rounding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_double_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_double_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_dsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_dstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_einsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_empty_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_empty_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_empty_permuted_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_empty_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_eq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_equal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_erf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_erfc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_erfinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_exp2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_exp_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_expand_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_expand_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_expand_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_expm1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_exponential_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_eye_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_fft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_fft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_fftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_fftshift_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_hfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_hfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_hfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ifft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ifft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ifftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ifftshift_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ihfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ihfft_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ihfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_irfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_irfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_irfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_rfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_rfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_rfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_flatten_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_flip_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fliplr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_flipud_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_float_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_float_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_float_power_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_floor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_floor_divide_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fmin_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fmod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_frac_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_frexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_full_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_full_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_gather_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_gcd_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ge_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_geometric_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_geqrf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_gradient_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_grid_sampler_2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_grid_sampler_3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_gt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_half_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_half_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_hash_tensor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_heaviside_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_histc_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_hsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_hstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_hypot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_i0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_igamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_igammac_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_imag_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_put_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_put_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_reduce_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_reduce_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_reduce_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_reduce_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_inner_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_int_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_int_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isclose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isfinite_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isinf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isnan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isneginf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isposinf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isreal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_istft_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_item_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_unary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_kron_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_kthvalue_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lcm_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ldexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_le_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lerp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lgamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_cholesky_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_cond_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_det_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_eig_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_eigh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_eigvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_eigvalsh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_householder_product_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_inv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_inv_ex_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_ldl_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_ldl_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lstsq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lu_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lu_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_matrix_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_matrix_power_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_matrix_rank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_multi_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_pinv_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_pinv_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_pinv_singular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_slogdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_solve_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_solve_triangular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_svd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_svdvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_tensorinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_tensorsolve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_vander_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_vecdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_vector_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linspace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linspace_tensor_overload_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log10_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log1p_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log_normal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log_softmax_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logaddexp2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logaddexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logcumsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logical_and_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logical_not_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logical_or_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logical_xor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logspace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logspace_tensor_overload_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_long_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_long_functorch_no_channels_last_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lu_unpack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mH_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mT_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_argmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_argmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_cumprod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_cumsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_log_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_logaddexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_logsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_mean_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_median_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_normalize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_softmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_std_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_var_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_matmul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_matrix_exp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_max_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_max_reduction_no_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_max_reduction_with_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_maximum_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_median_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_min_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_min_reduction_no_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_min_reduction_with_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_minimum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mode_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_movedim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_msort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_multinomial_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mvlgamma_mvlgamma_p_5_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nan_to_num_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nanmean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nanmedian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nanquantile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nansum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_narrow_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_narrow_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_native_batch_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_native_dropout_backward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_native_layer_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ne_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_neg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_empty_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_empty_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_full_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_ones_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_zeros_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nextafter_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_batch_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_celu_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv_transpose1d_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_cross_entropy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_ctc_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_dropout2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_dropout3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_dropout_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_elu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_embedding_bag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_embedding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_fractional_max_pool2d_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_gelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_glu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_grid_sample_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_group_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hardshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hardswish_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hardtanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_huber_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_instance_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_area_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_bilinear_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_kl_div_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_l1_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_layer_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_leaky_relu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_linear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_local_response_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_logsigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool1d_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_mish_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_mse_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_normalize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_one_hot_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_circular_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_constant_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_reflect_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_replicate_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pdist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_prelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_relu6_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_relu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_rms_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_rrelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_selu_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_silu_complex_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_silu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softplus_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softsign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_tanhshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_threshold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_unfold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_upsample_nearest_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nonzero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nonzero_static_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_norm_fro_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_norm_inf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_norm_nuc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_normal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_normal_in_place_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_normal_number_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ones_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ones_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ormqr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_outer_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_pca_lowrank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_permute_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_permute_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_pinverse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polar_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_positive_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_pow_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_put_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_quantile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rad2deg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rand_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_randint_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_randint_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_randn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_randn_like_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ravel_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_real_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_reciprocal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_remainder_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_renorm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_repeat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_repeat_interleave_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_reshape_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_reshape_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_resize__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_resize_as__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_resolve_conj_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_resolve_neg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_roll_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rot90_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_round_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_round_decimals_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_round_decimals_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_round_decimals_neg_3_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rsqrt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rsub_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scalar_tensor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_searchsorted_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_select_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sgn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_short_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_short_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sign_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_bartlett_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_blackman_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_cosine_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_exponential_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_gaussian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_general_cosine_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_general_hamming_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_hamming_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_hann_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_kaiser_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_nuttall_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signbit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sinc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sinh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_slice_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_slice_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_softmax_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_softmax_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sparse_mm_reduce_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sparse_sampled_addmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_airy_ai_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_bessel_j0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_bessel_j1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_bessel_y0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_bessel_y1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_entr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_erfcx_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_hermite_polynomial_h_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_hermite_polynomial_he_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_i0e_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_i1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_i1e_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_legendre_polynomial_p_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_log_ndtr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_modified_bessel_i0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_modified_bessel_i1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_modified_bessel_k0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_modified_bessel_k1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_ndtr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_ndtri_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_shifted_chebyshev_polynomial_t_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_spherical_bessel_j0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_xlog1py_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_zeta_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_split_list_args_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_split_with_sizes_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_split_with_sizes_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sqrt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_square_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_squeeze_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_squeeze_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_squeeze_multiple_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_stack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_std_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_std_mean_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_std_mean_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_std_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_stft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sub_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sum_to_size_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_svd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_svd_lowrank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_t_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_take_along_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_take_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tensor_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tensordot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_to_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_to_sparse_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_topk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_torch__scaled_mm_cuda_float8_e4m3fn, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_torch_ops_aten__flash_attention_forward_cuda_float16, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_trace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_transpose_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_transpose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_trapezoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_trapz_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_triangular_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tril_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tril_indices_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_triu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_triu_indices_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_true_divide_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_trunc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unbind_copy_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unbind_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unflatten_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unfold_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unfold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_uniform_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unique_consecutive_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unique_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unravel_index_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unsafe_chunk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unsafe_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unsqueeze_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unsqueeze_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_var_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_var_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_var_mean_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_var_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_vdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_as_complex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_as_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_as_real_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_vsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_vstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_where_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_xlogy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_zero__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_zeros_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_zeros_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_cholesky_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_cond_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_det_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_eig_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_eigh_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_eigvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_eigvalsh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_householder_product_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_inv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_inv_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_ldl_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_ldl_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lstsq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lu_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lu_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_matrix_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_matrix_power_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_matrix_rank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_multi_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_pinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_pinv_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_pinv_singular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_slogdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_solve_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_solve_triangular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_svd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_svdvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_tensorinv_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_tensorsolve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_vander_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_vecdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_vector_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_multi_dot_failure_1D_input_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_with_anomaly_detection_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_add_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_binary_cross_entropy_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_diagonal_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_div_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_expand_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_index_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_inplace_manyview_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_inplace_view_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_lgamma_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_log1p_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_log_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_log_softmax_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_logsumexp_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_max_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_median_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_min_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_mul_cuda, 
test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_permute_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend0_randomness_different_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend0_randomness_error_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend0_randomness_same_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend1_randomness_different_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend1_randomness_error_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend1_randomness_same_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend2_randomness_different_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend2_randomness_error_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend2_randomness_same_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_reshape_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sdpa_backend0_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sdpa_backend1_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sdpa_backend2_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_select_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sigmoid_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_slice_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_stack_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sub_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_threshold_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_trace_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_unrelated_output_cuda, 
test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_unrelated_output_multiple_grad_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_vmap_fallback_check, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_vmap_fallback_check_ok, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_where_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_grad_and_value_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_grad_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_jacfwd_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_jacrev_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_jvp_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_vjp_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_vmap_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_same_batched_input_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_first_batched_probability_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_last_batched_probability_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_none_batched_probability_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_first_batched_probability_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_0_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_0_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_0_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_1_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_1_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_1_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_2_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_2_out_dim_1_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_2_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_unbatched_randomness_different_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_unbatched_randomness_error_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_unbatched_randomness_same_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_different_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_different_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_error_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_error_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_same_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_same_use_generator_True_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_first_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_first_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_last_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_last_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_none_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_none_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_first_dim_2_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_first_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_last_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_last_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_none_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_none_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_first_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_first_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_last_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_last_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_none_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_none_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_jacfwd_with_random_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_error_batched_input_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_True_batched_input_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_True_batched_input_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_True_batched_input_none_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_first_batched_other_none_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_last_batched_other_none_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_none_batched_other_none_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_first_batched_other_none_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_same_batched_input_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_different_batched_input_none_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_same_batched_input_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_different_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_different_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_error_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_error_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_same_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_same_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_unsupported_random_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_0_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_0_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_0_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_1_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_1_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_1_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_2_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_2_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_2_out_dim_2_cuda, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test__is_all_true_cuda, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test__is_any_true_cuda, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test_check_tensor_cuda, 
test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test_vmap_fallback_check, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test_vmap_fallback_check_ok, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_cat_batching_rule_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_binary_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_binary_nt_and_batched_dense_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_binary_nt_and_unbatched_dense_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_unary_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_with_nt_and_batched_dense_with_nonzero_bdim_raises_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_multilevel_vmap_raises_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_nt_acts_as_dense_in_vmap_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_nt_with_nonzero_in_dim_raises_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_nt_with_nonzero_out_dim_raises_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_shape_call_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_vmap_fallback_check, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_vmap_fallback_check_ok 2025-10-10T02:34:58.2300944Z 2025-10-10T02:34:58.2301118Z Running test_ops_gradients 2/3 ... [2025-10-10 02:34:58.049893] 2025-10-10T02:34:58.2301481Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:34:58.2302411Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '-m', 'not serial', '--shard-id=2', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:34:58.050290] 2025-10-10T02:34:58.6574126Z 2025-10-10T02:34:58.6575344Z test_nestedtensor 1/3 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_1.3_7725799f97b2f074_.log 2025-10-10T02:34:58.6840410Z Running 538 items in this shard: test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_copy_, test/test_nestedtensor.py::TestNestedTensor::test_default_nested_tensor, test/test_nestedtensor.py::TestNestedTensor::test_fill_, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_randn_like, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_zeros_like, test/test_nestedtensor.py::TestNestedTensor::test_nested_namespace, test/test_nestedtensor.py::TestNestedTensor::test_size, test/test_nestedtensor.py::TestNestedTensor::test_unbind_0, test/test_nestedtensor.py::TestNestedTensor::test_unbind_3, test/test_nestedtensor.py::TestNestedTensor::test_zero_, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_device_checks_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_jagged_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_strided_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_embedding_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_is_all_true_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_is_any_true_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amax_dtypes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amax_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amax_dtypes_cuda_int32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amin_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amin_dtypes_cuda_int32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amin_dtypes_cuda_int8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amin_dtypes_cuda_uint8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_int64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_int8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_int32, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_uint8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_int32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_uint8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_min_dtypes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_min_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_min_dtypes_cuda_int16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_min_dtypes_cuda_int8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_breaking_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_with_bmm_path_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_masked_select_cuda, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_4_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float16, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_share_memory_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_abs_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_cos_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_isinf_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_relu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_silu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_sqrt_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_tanh__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_tanh_cuda, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float32, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_abs_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_dropout_backward_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_dropout_backward_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_2_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_1023_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_256_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_2_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_masked_fill_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_padded_fused_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_generates_leaf_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_plus_transpose_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_matmul_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_to_padded_tensor_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_relu_backward_cuda, 
test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_split_with_sizes_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_unbind_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float16, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_autograd_function_with_None_grad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_preserves_metadata_cache_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_max_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_composite_op_with_custom_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dummy_mha_with_nt_use_legacy_api_True_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_is_contiguous_cuda, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_empty_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_full_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_randint_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_activation_checkpoint_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_fx_trace_cuda, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_transposed_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_transposed_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_permute_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_profiler_sequence_nr_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_record_stream_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_reshape_decomp_requires_grad_False_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_flop_counter_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_noncontig_transposed_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_True_components_require_grad_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_False_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_False_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_False_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_split_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unary_pointwise_transposed_inputs_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_equals_2_bad_dim_cuda, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_bmm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmin_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_embedding_bag_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_embedding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_to_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_unflatten_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_bmm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cdouble_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log10_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_embedding_bag_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_embedding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sgn_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_hash_tensor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_igamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_and_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_maximum_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_min_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nextafter_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_j0_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_spherical_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_zeta_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_split_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_squeeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_add_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_char_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ge_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_hash_tensor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isclose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_or_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_xor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_long_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_var_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nextafter_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_embedding_bag_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_embedding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_short_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_y1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_h_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sqrt_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_xlogy_cuda_float32
2025-10-10T02:34:58.7090002Z
2025-10-10T02:34:58.7090211Z Running test_ops_jit 1/2 ... [2025-10-10 02:34:58.658852]
2025-10-10T02:34:58.7090649Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-10-10T02:34:58.7091767Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_jit.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:34:58.659279]
2025-10-10T02:36:47.1418213Z
2025-10-10T02:36:47.1422585Z test_decomp 2/17 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_2.17_cd8d9170b440703d_.log
2025-10-10T02:36:47.1590924Z Running 535 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__batch_norm_with_update_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__native_batch_norm_legit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_right_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_grid_sampler_3d_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_solve_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_power_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svdvals_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logaddexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nextafter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nextafter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv1d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_glu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_kl_div_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_complex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softplus_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_neg_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_gaussian_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_kaiser_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_mm_reduce_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_real_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_index_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_std_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_t_copy_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_frac_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_grid_sampler_2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_lcm_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_prelu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_neg_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_var_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_eval_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_train_mode_cuda_float32, test/test_decomp.py::HasDecompTest::test_aten_core_operators 2025-10-10T02:36:47.1757613Z 2025-10-10T02:36:47.1757812Z Running xpu/test_conv 1/1 ... [2025-10-10 02:36:47.142875] 2025-10-10T02:36:47.1758226Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:36:47.1759333Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'xpu/test_conv.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:36:47.143479] 2025-10-10T02:36:51.0837742Z 2025-10-10T02:36:51.0839023Z xpu/test_conv 1/1 was successful, full logs can be found in artifacts with path test/test-reports/xpu.test_conv_1.1_9901229d2927e485_.log 2025-10-10T02:36:51.0840263Z Running 0 items in this shard: 2025-10-10T02:36:51.0840615Z 2025-10-10T02:37:21.8898587Z 2025-10-10T02:37:21.8899592Z test_decomp 3/17 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_3.17_b8eb35a59c73fcdc_.log 2025-10-10T02:37:21.9075573Z Running 547 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__softmax_backward_data_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bincount_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exponential_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_complex32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvals_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_householder_product_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_multi_dot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_multi_dot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_slogdet_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_triangular_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_solve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_log_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_bilinear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gelu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_kl_div_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_prelu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rms_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_number_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pinverse_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_svd_lowrank_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_uint16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_and_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_complex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_mv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_rsub_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_select_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_squeeze_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unfold_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_grid_sampler_2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_huber_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_leaky_relu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_complex64 2025-10-10T02:37:21.9245266Z 2025-10-10T02:37:37.6701538Z 2025-10-10T02:37:37.6702990Z test_decomp 14/17 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_14.17_d15a11e3338a03af_.log 2025-10-10T02:37:37.6994488Z Running 526 items in this shard: test/test_decomp.py::TestDecompCUDA::test_bernoulli_default_cuda, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_igammac_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_triangular_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_bilinear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gelu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rms_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_gaussian_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_indices_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_and_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_quick_core_backward__unsafe_masked_index_put_accumulate_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_clamp_min_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nansum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_take_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_transpose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float8_e5m2fnuz, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_frexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_log_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_log_softmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_native_batch_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_native_layer_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_grad_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_3_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_eval_mode_cuda_float64, test/test_decomp.py::DecompOneOffTestsCUDA::test_sdpa_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_decomp.py::HasDecompTest::test_conv1d_decomposition
2025-10-10T02:37:37.7231999Z
2025-10-10T02:38:22.7222539Z
2025-10-10T02:38:22.7223472Z test_ops 7/9 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_7.9_5ccc1f897fee9055_.log
2025-10-10T02:38:22.8415162Z Running 3832 items in this shard: test/test_ops.py::TestSelfKwarg::test_self_kwargs, test/test_ops.py::TestCommonCUDA::test_compare_cpu_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__batch_norm_with_update_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_expand_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_relu6_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_T_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nonzero_static_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rand_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_randn_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_tanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unflatten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes_H_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__batch_norm_with_update_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_asin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_channel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_round_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__unsafe_masked_index_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_asin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_combinations_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cov_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_gather_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_gt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isnan_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ldexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cholesky_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logdet_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lu_unpack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mode_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nanmedian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_kl_div_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softsign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rad2deg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_repeat_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_resolve_neg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_decimals_neg_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_polygamma_special_polygamma_n_0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_to_sparse_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_triu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unique_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_zero__cuda, test/test_ops.py::TestCommonCUDA::test_errors___rand___cuda, test/test_ops.py::TestCommonCUDA::test_errors_arange_cuda, test/test_ops.py::TestCommonCUDA::test_errors_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_errors_cov_cuda, test/test_ops.py::TestCommonCUDA::test_errors_eye_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gather_cuda, test/test_ops.py::TestCommonCUDA::test_errors_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_errors_lcm_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_lt_cuda, test/test_ops.py::TestCommonCUDA::test_errors_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_errors_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_embedding_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_multi_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout0_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_laguerre_polynomial_l_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sub_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_dot_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_permute_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rpow___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bool_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chalf_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumulative_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hash_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hash_tensor_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_return_by_ref_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_minimum_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_msort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softmin_with_dtype_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resize__cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_w_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_h_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zero__cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___getitem___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values__unsafe_masked_index_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_abs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diff_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_rfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_4inputs_with_extra_args_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_binary_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_unary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_outer_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_3_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_4_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_scatter_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_airy_ai_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unfold_copy_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsafe_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsqueeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmod___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumulative_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diff_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_histc_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log2_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmedian_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_binary_cross_entropy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_reflect_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_static_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_consecutive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_numpy_ref_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_argwhere_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_smooth_l1_loss_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_cosine_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_out_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_matrix_norm_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_block_diag_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumsum_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log2_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nanmean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tensordot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sort_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning___rand___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___ror___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_char_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_arange_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_broadcast_to_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_conj_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_signbit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sinh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_i0e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_ndtri_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_trace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unfold_copy_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_all_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bincount_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_to_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cholesky_inverse_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cummin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_grid_sampler_3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_histogram_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_fill_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cond_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_rank_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vander_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cosine_similarity_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_feature_alpha_dropout_with_train_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_group_norm_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_linear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ones_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_searchsorted_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_legendre_polynomial_p_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_scaled_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_with_sizes_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tensordot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_zeros_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hypot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs__conversions_polar_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_amin_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_lt_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_normal__in_place_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_shapes_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e5m2, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e5m2fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frexp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mse_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_indices_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_indices_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pdist_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argwhere_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_corrcoef_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hash_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_inner_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorinv_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_batch_norm_without_cudnn_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reshape_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_list_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_uniform_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vdot_cuda_complex64, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atan_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_std_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_tanhshrink_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_3d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_geometric_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_linear_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_u_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dsplit_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_multi_dot_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_selu_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unbind_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_add_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_any_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_argsort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fft2_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nanmedian_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_short_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unbind_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rpow___cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_permuted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfftn_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_neg_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_y0_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rmatmul___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_half_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_acosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_embed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_equal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expand_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expm1_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifft2_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_svdvals_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_narrow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_ones_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sinc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cdouble_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_constant_pad_nd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagflat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagonal_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_geqrf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_half_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_imag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_inner_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isfinite_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_istft_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_det_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_qr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log1p_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_std_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_rms_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_permute_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_randn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resize__cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sin_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sinc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_t_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_to_sparse_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_trace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_transpose_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_trapz_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unfold_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_vdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_where_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_chalf_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_double_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_addcdiv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_flipud_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_item_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_repeat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_roll_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_softmax_with_dtype_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_abs_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_any_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cartesian_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_conj_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumprod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diff_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dist_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_irfft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fliplr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_float_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_gather_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_half_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_imag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isfinite_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isinf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_kron_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lstsq_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_multi_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_pinv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log1p_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_narrow_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ne_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_neg_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ormqr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_put_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_as_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scalar_tensor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scatter_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_squeeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trapz_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_true_divide_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unbind_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsafe_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsafe_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zeros_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view_H_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_T_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_acos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_clamp_max_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_erfinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_hypot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isnan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isreal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_nll_loss_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rad2deg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_reciprocal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_remainder_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sgn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sinc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_trunc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unflatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unfold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_var_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__softmax_backward_data_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmm_decomposed_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argwhere_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bfloat16_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_block_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cartesian_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_min_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_cos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_count_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fliplr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_heaviside_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_i0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_igamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_igammac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_det_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vander_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vecdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vector_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_xor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_long_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_sum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_pool2d_with_indices_backward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_meshgrid_variadic_tensors_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_minimum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanmedian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nansum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_native_batch_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_grid_sample_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardsigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_instance_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_soft_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softshrink_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softsign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_threshold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_nuc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_normal_in_place_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randn_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_repeat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_decimals_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_cosine_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_gaussian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_general_hamming_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signbit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_airy_ai_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_w_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_erfcx_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_topk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tril_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsqueeze_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_mean_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_vdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_vsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_vstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_where_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rxor___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfft_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gather_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_histc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lcm_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_list_of_tensors_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_narrow_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_qr_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scalar_tensor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_hann_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_and_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cat_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_flatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_reduction_with_dim_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_replicate_negative_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_topk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_transpose_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unsafe_chunk_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_float_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout2d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trace_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linspace_tensor_overload_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_circular_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_randint_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_randint_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_blackman_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestFakeTensorCUDA::test_fake_tril_indices_cuda_int64, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_unique_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_any_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_or_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_count_nonzero_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_flatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_grid_sampler_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isnan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mH_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_narrow_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_margin_ranking_loss_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_blackman_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signbit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_topk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_triu_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_bfloat16, test/test_ops.py::TestTagsCUDA::test_tags__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_half_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_addcdiv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_not_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_dstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erfc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expm1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fliplr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_gcd_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_igammac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lerp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_not_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_maximum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_mul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nextafter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_permute_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rad2deg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reshape_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_round_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_rsub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_i1e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tril_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_acosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addcmul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bfloat16_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_not_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bool_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cdouble_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cholesky_inverse_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cholesky_solve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clone_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_contiguous_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagonal_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_permuted_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_erf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expm1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_hfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_full_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_ge_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_geometric_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hash_tensor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_histc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_inner_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isreal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_item_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_svdvals_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vecdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_long_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mT_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mode_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_narrow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_elu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_embedding_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_relu6_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_put_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randn_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sort_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_j0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_mean_unbiased_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_unbiased_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_trapz_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsafe_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_var_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_zero__cuda_float32 2025-10-10T02:38:22.9699057Z 2025-10-10T02:39:04.3044107Z 2025-10-10T02:39:04.3046394Z test_decomp 15/17 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_15.17_e36e0914d8b94d39_.log 2025-10-10T02:39:04.3327543Z Running 525 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__native_batch_norm_legit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_lengths_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_and_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_and_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_igamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cond_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cond_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_power_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_power_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_ex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_solve_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_unpack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_kl_div_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_linear_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pdist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pinverse_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_exponential_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_complex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_and_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_left_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_right_shift_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_baddbmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_max_unpool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_max_unpool3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_softshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_squeeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unbind_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unsqueeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_elu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mish_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float32, test/test_decomp.py::DecompOneOffTestsCUDA::test_sdpa_nn_functional_scaled_dot_product_attention_cuda_float16
2025-10-10T02:39:04.3565442Z
2025-10-10T02:40:41.3534426Z
2025-10-10T02:40:41.3539903Z test_ops_gradients 2/3 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_2.3_455a00346aa48bf5_.log
2025-10-10T02:40:41.4332753Z Running 1797 items in this shard: test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyMulScalarCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpySortCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmatmul___cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_abs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argsort_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_block_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_byte_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_char_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_count_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diag_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_igammac_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_inner_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_le_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_power_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_singular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorsolve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_tensor_overload_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_xor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_cumsum_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nansum_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_elu_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multilabel_margin_loss_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_upsample_bilinear_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize__cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_slice_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stft_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_copy_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyViewCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addcmul_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_auto_functionalize_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_byte_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cross_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expm1_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_grid_sampler_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_half_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_invoke_quant_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_invoke_subgraph_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_4inputs_with_extra_args_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_norm_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log1p_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_triple_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_scatter_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ne_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_glu_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_mish_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_nuc_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_short_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_scaled_modified_bessel_k0_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trace_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_uniform_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vsplit_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_while_loop_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyCubeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__segment_reduce_offsets_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_1d_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_auto_functionalize_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clone_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dstack_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft2_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_heaviside_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_fill_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_quant_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_subgraph_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvalsh_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linspace_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_log_softmax_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_minimum_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv2d_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_leaky_relu_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_rrelu_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ormqr_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_4_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ravel_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize_as__cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scan_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_slice_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_with_dtype_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_take_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vdot_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_while_loop_stack_output_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___getitem___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__upsample_bilinear2d_aa_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_2d_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_clamp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_contiguous_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_corrcoef_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eq_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_erfinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_rfft2_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_amax_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isreal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log2_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_min_reduction_with_dim_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nan_to_num_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_binary_cross_entropy_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_linear_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_local_response_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softplus_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_positive_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_interleave_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resolve_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sigmoid_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_legendre_polynomial_p_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_sparse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_true_divide_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___getitem___cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rpow___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_aminmax_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cauchy_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_column_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cummax_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fft2_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_floor_divide_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hash_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isposinf_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_istft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_binary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_factor_ex_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorsolve_cuda_complex128, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_cumprod_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mul_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmedian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ne_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_avg_pool1d_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_group_norm_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_unshuffle_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softsign_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pinverse_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_select_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_kaiser_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_bessel_y0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tanh_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vsplit_cuda_float64, 
test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_like_cuda_complex128 2025-10-10T02:40:41.4800294Z 2025-10-10T02:40:41.4800498Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T02:40:41.4800848Z Uploading artifacts took 0.00 seconds 2025-10-10T02:41:16.8620814Z 2025-10-10T02:41:16.8622628Z test_ops_jit 1/2 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_jit_1.2_5ae4e8761c734768_.log 2025-10-10T02:41:16.8951877Z Running 546 items in this shard: test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_abs_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_acosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_asin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_atan2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_clamp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_digamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_div_no_rounding_mode_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_div_trunc_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_erfinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_i0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_igamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_le_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_lgamma_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_householder_product_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_matrix_power_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_log1p_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_log_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_matmul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_min_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_movedim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_ne_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_group_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_layer_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_rms_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_decimals_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sinc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_softmax_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sub_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_tanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_vstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_xlogy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_H_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___getitem___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rdiv___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmatmul___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmatmul___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rpow___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rpow___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__chunk_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acosh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_add_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addbmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcdiv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcdiv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcmul_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_decomposed_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_aminmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_angle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argsort_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argwhere_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argwhere_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asin_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asinh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atan2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_3d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_baddbmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_baddbmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bernoulli_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_block_diag_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bool_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_shapes_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_to_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bucketize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_byte_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cartesian_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cauchy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cdouble_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cfloat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cfloat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_char_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_inverse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clamp_max_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clone_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clone_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_column_stack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_combinations_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_physical_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_constant_pad_nd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_constant_pad_nd_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_contiguous_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_copysign_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_corrcoef_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_corrcoef_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cos_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cos_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cosh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_count_nonzero_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_count_nonzero_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cov_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cov_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cross_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cross_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cummax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumprod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumsum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumulative_trapezoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_embed_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_embed_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagflat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_digamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_no_rounding_mode_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_no_rounding_mode_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_double_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dsplit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_permuted_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_strided_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_strided_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eq_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eq_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_equal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_erf_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_as_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expm1_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exponential_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eye_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftshift_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftshift_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftshift_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftshift_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ihfftn_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_rfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flatten_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flip_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fliplr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flipud_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_power_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fmod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gradient_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gradient_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_grid_sampler_2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_half_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_histc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hsplit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_i0_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_igammac_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_imag_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_fill_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_fill_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_inner_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_int_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_int_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isclose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isfinite_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isinf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isnan_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isnan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_2inputs_2outputs_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_return_by_ref_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_unary_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_kthvalue_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ldexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lerp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lgamma_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cond_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_det_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_det_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eig_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvals_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_householder_product_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_power_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_hermitian_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_qr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_slogdet_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_slogdet_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svdvals_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorsolve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorsolve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vecdot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vector_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vector_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log10_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log10_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log1p_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_normal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_softmax_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logaddexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logcumsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logdet_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_and_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_not_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_or_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_tensor_overload_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_unpack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_argmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumprod_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumprod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_logsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_median_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_normalize_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_normalize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_sum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_var_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_pool2d_with_indices_backward_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_reduction_no_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_reduction_with_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_list_of_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_variadic_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_min_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_minimum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_movedim_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_movedim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_msort_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mul_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nan_to_num_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanmean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanquantile_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nansum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_native_layer_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ne_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ne_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_neg_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_neg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_strided_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_strided_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_full_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_full_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_alpha_dropout_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_avg_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_avg_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_channel_shuffle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv3d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_cosine_similarity_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_cross_entropy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_dropout_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_elu_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_embedding_bag_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_group_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardsigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardswish_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardtanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_huber_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_instance_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_l1_loss_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_l1_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_linear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_local_response_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool3d_grad_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_normalize_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_normalize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_circular_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_constant_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_reflect_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_reflect_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pdist_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_shuffle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_prelu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_rrelu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_selu_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_silu_complex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softplus_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softshrink_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softsign_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_tanhshrink_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_threshold_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_unfold_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_upsample_nearest_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_static_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_fro_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_inf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_inf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_in_place_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_number_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ormqr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pinverse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polar_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pow_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_put_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_put_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_qr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rad2deg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rand_like_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rand_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randint_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_real_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reciprocal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize__cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize_as__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_conj_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_roll_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_decimals_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_decimals_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsqrt_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsub_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsub_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scalar_tensor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_add_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_sum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_searchsorted_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_select_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_short_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sigmoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sign_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_bartlett_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_blackman_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_cosine_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_gaussian_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_hann_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_kaiser_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_nuttall_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signbit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_slice_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_softmax_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sort_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sparse_mm_reduce_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_airy_ai_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_j0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_y0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_u_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_entr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_erfcx_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_hermite_polynomial_h_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_i0e_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_i1e_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_legendre_polynomial_p_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_i1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_k0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_k1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_ndtr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_xlog1py_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sqrt_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_square_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_multiple_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_multiple_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_to_size_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_along_dim_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_along_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tan_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensor_split_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensordot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensordot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tile_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tile_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_sparse_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_sparse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapezoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapz_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triangular_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tril_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_true_divide_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unflatten_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_copy_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_split_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_split_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vdot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vdot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vstack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_where_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_xlogy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zero__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zero__cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_like_cuda_float32 2025-10-10T02:41:16.9228577Z 2025-10-10T02:41:20.3841643Z 2025-10-10T02:41:20.3842942Z functorch/test_control_flow 4/5 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_control_flow_4.5_b591b384ffa47223_.log 2025-10-10T02:41:20.4279789Z Running 386 items in this shard: test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_inner_fn, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_pytree_input, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_zeros_unused_branch_complex_compile_fail_compile_mode_compile_dynamic_shape_scalar_False, test/functorch/test_control_flow.py::TestControlFlow::test_map_dict_in_out, test/functorch/test_control_flow.py::TestControlFlow::test_map_gpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_associative_scan, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_carry_output_alias, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_compile_mode_eager_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_init_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_additional_inputs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_additional_inputs_cuda, 
test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_complex_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_random_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_random_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_eager_cuda_autograd_True, 
test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_none_cpu_autograd_True, 
test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cpu_complex64, 
test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cuda_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cpu_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cuda_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cpu_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cuda_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cuda_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cuda_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cpu_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cpu_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cuda_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cuda_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_eager_cpu_autograd_False, 
test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_complex_reverse_True_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_init_longer_carry, test/functorch/test_control_flow.py::TestControlFlow::test_scan_multiple_layers_gradient_layers_1_device_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_multiple_layers_gradient_layers_3_device_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_simple_graph_wrong_dtype, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_none_cpu_autograd_True, 
test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_generic_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_generic_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_generic_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_pointwise_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_pointwise_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_combine_fn_wrong_meta_in_combine_fn, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_combine_mode_generic_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_eager_combine_mode_generic_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_eager_combine_mode_pointwise_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_none_combine_mode_generic_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_combine_mode_generic_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_combine_mode_pointwise_cpu_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_combine_mode_pointwise_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_generic_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_eager_combine_mode_pointwise_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_none_combine_mode_pointwise_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_generic_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_generic_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_generic_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_generic_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_generic_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_eager_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_eager_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_none_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_none_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_none_reverse_True_cpu_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_none_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_combine_mode_generic_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_combine_mode_pointwise_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_eager_combine_mode_generic_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_eager_combine_mode_pointwise_cuda_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_none_combine_mode_generic_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_none_combine_mode_pointwise_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_generic_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_eager_combine_mode_generic_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_eager_combine_mode_generic_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_eager_combine_mode_pointwise_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_none_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_none_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_none_reverse_True_cpu_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_eager_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_eager_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_eager_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_none_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_none_reverse_False_cuda_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_none_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_eager_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_none_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_eager_reverse_False_cpu_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_eager_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_eager_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_none_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_False_same_direction_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_False_same_direction_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_True_same_direction_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_False_same_direction_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_True_same_direction_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_True_same_direction_True_cpu_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_True_same_direction_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_False_same_direction_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_True_same_direction_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_True_same_direction_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_True_same_direction_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_True_same_direction_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_True_same_direction_True_cuda_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_True_same_direction_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_False_same_direction_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_False_same_direction_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_False_same_direction_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_False_same_direction_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_True_same_direction_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_False_same_direction_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_False_same_direction_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_False_same_direction_True_cpu_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_True_same_direction_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_generic_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_generic_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_generic_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_generic_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_generic_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_generic_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_pointwise_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_dynamic_shape_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_dynamic_shape_reverse_True_cpu_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_dynamic_shape_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_none_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_generic_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_generic_reverse_True_cpu_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_generic_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_generic_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_False_cuda_combine_mode_pointwise_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_True_cuda_combine_mode_generic_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_True_cuda_combine_mode_generic_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_False_cpu_combine_mode_generic_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_False_cpu_combine_mode_generic_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_False_cpu_combine_mode_pointwise_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_False_cuda_combine_mode_generic_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_False_cuda_combine_mode_pointwise_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_True_cpu_combine_mode_generic_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_True_cpu_combine_mode_pointwise_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_True_cuda_combine_mode_pointwise_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_False_cpu_combine_mode_generic_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_True_cpu_combine_mode_pointwise_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_generic_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_generic_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_generic_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_pointwise_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_pointwise_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_generic_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_False_cuda_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_generic_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_pointwise_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_input_output_alias, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_dynamic_shape_loop_type_for_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_dynamic_shape_loop_type_for_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_loop_type_for_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_loop_type_for_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_loop_type_for_reverse_True_cpu_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_eager_loop_type_for_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_none_loop_type_for_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_none_loop_type_for_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_nested, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_dynamic_shape_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_eager_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_none_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_none_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_compile_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_compile_dynamic_shape_cuda_autograd_True, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_compile_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_combine_mode_generic_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_combine_mode_generic_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_combine_mode_pointwise_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_combine_mode_pointwise_compile_mode_none_reverse_False_cpu, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_no_grad_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_no_grad_combine_mode_generic_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_no_grad_combine_mode_generic_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_no_grad_combine_mode_pointwise_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_partial_grad_no_grad_combine_mode_pointwise_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_generic_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_generic_reverse_False_cpu_autograd_False, 
test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_generic_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_generic_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_generic_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_generic_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_pointwise_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_eager_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_input_mutation_on_true_branch, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_nested_input_mutation, 
test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_output_dynamic_True_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_traced_multi, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_traced_other_inputs, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_symint_operands_requires_grad_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_traced_not_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_0_nClosure_1_nesting_2, 
test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_function_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_function_nOperands_2_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_function_nOperands_2_nClosure_1_nesting_0, 
test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_consecutive_make_fx_symbolic, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_tensor_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_unbacked_sym_pred, test/functorch/test_control_flow.py::TestControlFlowTraced::test_map_functionalized_arg_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_map_unfunc_boolean_tensor_for_nested_map_cond, test/functorch/test_control_flow.py::TestControlFlowTraced::test_raise_error_on_mismatch_type_size_fake_tensor, test/functorch/test_control_flow.py::TestControlFlowTraced::test_scan_functionalized, test/functorch/test_control_flow.py::TestControlFlowTraced::test_scan_functionalized_elem_alias, test/functorch/test_control_flow.py::TestControlFlowTraced::test_scan_functionalized_elem_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_autograd_symbolic_dict, test/functorch/test_control_flow.py::TestControlFlowTraced::test_two_hops_not_sharing_code_obj, test/functorch/test_control_flow.py::TestControlFlowTraced::test_vmap_vmap_boolcond_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_pytree_int_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_cpp_while_loop_test_nested, 
test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_functorch_while_loop_test_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_no_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_compile_dynamic_True_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_compile_dynamic_False_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_export_strict_True_dynamic_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_compile_dynamic_True_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_simple_functionalize_check_graph_func_type_cpp, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_simple_functionalize_check_graph_func_type_functorch, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_nested2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_nested_with_linear, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_GraphModule, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_Tensor, test/functorch/test_control_flow.py::TestHopSchema::test_scan_gen_schema_multiple_inputs, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_SymInt, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_str 2025-10-10T02:41:20.4547461Z 2025-10-10T02:41:21.2873868Z Running test batch 'tests to run' cost 7540.42 seconds 2025-10-10T02:41:22.4471217Z 2025-10-10T02:41:22.4471600Z real 125m45.567s 2025-10-10T02:41:22.4472108Z user 859m47.192s 2025-10-10T02:41:22.4472540Z sys 
131m34.078s 2025-10-10T02:41:22.4472960Z + assert_git_not_dirty 2025-10-10T02:41:22.4473648Z + [[ linux-jammy-rocm-py3.10 != *rocm* ]] 2025-10-10T02:41:22.4474727Z + test_aten 2025-10-10T02:41:22.4475257Z + echo 'Running ATen tests with pytorch lib' 2025-10-10T02:41:22.4475961Z Running ATen tests with pytorch lib 2025-10-10T02:41:22.4476470Z + [[ -n '' ]] 2025-10-10T02:41:22.4476916Z + echo 'Running test with the build folder' 2025-10-10T02:41:22.4477492Z Running test with the build folder 2025-10-10T02:41:22.4478023Z + TEST_BASE_DIR=build/bin 2025-10-10T02:41:22.4479399Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10_hip.so build/bin 2025-10-10T02:41:22.4513888Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libcaffe2_nvrtc.so build/bin 2025-10-10T02:41:22.4545333Z + ln -sf '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libmkldnn*' build/bin 2025-10-10T02:41:22.4578991Z + ln -sf '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libnccl*' build/bin 2025-10-10T02:41:22.4612664Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_hip.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_python.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorchbind_test.so build/bin 2025-10-10T02:41:22.4648482Z + ls build/bin 2025-10-10T02:41:22.4697610Z BackoffTest 2025-10-10T02:41:22.4698110Z CMakeFiles 2025-10-10T02:41:22.4698618Z CTestTestfile.cmake 2025-10-10T02:41:22.4699089Z CppSignature_test 2025-10-10T02:41:22.4699542Z Dict_test 2025-10-10T02:41:22.4700013Z Dimname_test 2025-10-10T02:41:22.4700459Z 
FileStoreTest 2025-10-10T02:41:22.4700909Z HashStoreTest 2025-10-10T02:41:22.4701368Z IListRef_test 2025-10-10T02:41:22.4701838Z KernelFunction_test 2025-10-10T02:41:22.4702306Z List_test 2025-10-10T02:41:22.4702680Z MaybeOwned_test 2025-10-10T02:41:22.4703071Z NamedTensor_test 2025-10-10T02:41:22.4703519Z ProcessGroupGlooTest 2025-10-10T02:41:22.4703976Z StorageUtils_test 2025-10-10T02:41:22.4704473Z TCPStoreTest 2025-10-10T02:41:22.4704865Z apply_utils_test 2025-10-10T02:41:22.4705256Z atest 2025-10-10T02:41:22.4705639Z backend_fallback_test 2025-10-10T02:41:22.4706467Z basic 2025-10-10T02:41:22.4706823Z broadcast_test 2025-10-10T02:41:22.4707244Z c10_AllocatorConfig_test 2025-10-10T02:41:22.4707700Z c10_ArrayRef_test 2025-10-10T02:41:22.4708091Z c10_Bitset_test 2025-10-10T02:41:22.4708663Z c10_CompileTimeFunctionPointer_test 2025-10-10T02:41:22.4709209Z c10_ConstexprCrc_test 2025-10-10T02:41:22.4709651Z c10_DeadlockDetection_test 2025-10-10T02:41:22.4710113Z c10_DeviceGuard_test 2025-10-10T02:41:22.4710506Z c10_Device_test 2025-10-10T02:41:22.4710929Z c10_DispatchKeySet_test 2025-10-10T02:41:22.4711411Z c10_Enumerate_test 2025-10-10T02:41:22.4711795Z c10_Half_test 2025-10-10T02:41:22.4712197Z c10_InlineDeviceGuard_test 2025-10-10T02:41:22.4712674Z c10_InlineStreamGuard_test 2025-10-10T02:41:22.4713127Z c10_IntrusiveList_test 2025-10-10T02:41:22.4713552Z c10_LeftRight_test 2025-10-10T02:41:22.4713964Z c10_Metaprogramming_test 2025-10-10T02:41:22.4714593Z c10_NetworkFlow_test 2025-10-10T02:41:22.4715002Z c10_Scalar_test 2025-10-10T02:41:22.4715453Z c10_Semaphore_test 2025-10-10T02:41:22.4715862Z c10_SizesAndStrides_test 2025-10-10T02:41:22.4716310Z c10_StreamGuard_test 2025-10-10T02:41:22.4716725Z c10_SymInt_test 2025-10-10T02:41:22.4717431Z c10_Synchronized_test 2025-10-10T02:41:22.4717869Z c10_ThreadLocal_test 2025-10-10T02:41:22.4718275Z c10_TypeIndex_test 2025-10-10T02:41:22.4718667Z c10_TypeList_test 2025-10-10T02:41:22.4719062Z c10_TypeTraits_test 
2025-10-10T02:41:22.4719476Z c10_accumulate_test 2025-10-10T02:41:22.4719875Z c10_bfloat16_test 2025-10-10T02:41:22.4720283Z c10_bit_cast_test 2025-10-10T02:41:22.4720702Z c10_complex_math_test 2025-10-10T02:41:22.4721159Z c10_complex_test 2025-10-10T02:41:22.4721547Z c10_cow_test 2025-10-10T02:41:22.4721907Z c10_error_test 2025-10-10T02:41:22.4722315Z c10_exception_test 2025-10-10T02:41:22.4722718Z c10_flags_test 2025-10-10T02:41:22.4723116Z c10_generic_math_test 2025-10-10T02:41:22.4723651Z c10_hip_HIPAssertionsTest_1_var_test 2025-10-10T02:41:22.4724345Z c10_hip_HIPAssertionsTest_catches_stream 2025-10-10T02:41:22.4725192Z c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2025-10-10T02:41:22.4726061Z c10_hip_HIPAssertionsTest_from_2_processes 2025-10-10T02:41:22.4726942Z c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2025-10-10T02:41:22.4728000Z c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2025-10-10T02:41:22.4728792Z c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2025-10-10T02:41:22.4729413Z c10_hip_HIPTest 2025-10-10T02:41:22.4729837Z c10_intrusive_ptr_benchmark 2025-10-10T02:41:22.4730310Z c10_intrusive_ptr_test 2025-10-10T02:41:22.4730935Z c10_irange_test 2025-10-10T02:41:22.4731328Z c10_lazy_test 2025-10-10T02:41:22.4731699Z c10_logging_test 2025-10-10T02:41:22.4732081Z c10_optional_test 2025-10-10T02:41:22.4732506Z c10_ordered_preserving_dict_test 2025-10-10T02:41:22.4732990Z c10_registry_test 2025-10-10T02:41:22.4733398Z c10_small_vector_test 2025-10-10T02:41:22.4733879Z c10_ssize_test 2025-10-10T02:41:22.4734332Z c10_string_util_test 2025-10-10T02:41:22.4734808Z c10_string_view_test 2025-10-10T02:41:22.4735492Z c10_tempfile_test 2025-10-10T02:41:22.4735940Z c10_typeid_test 2025-10-10T02:41:22.4736364Z cmake_install.cmake 2025-10-10T02:41:22.4736776Z cpu_allocator_test 2025-10-10T02:41:22.4737168Z cpu_generator_test 2025-10-10T02:41:22.4737587Z cpu_profiling_allocator_test 
2025-10-10T02:41:22.4738046Z cpu_rng_test 2025-10-10T02:41:22.4738418Z dlconvertor_test 2025-10-10T02:41:22.4738809Z example_allreduce 2025-10-10T02:41:22.4739217Z extension_backend_test 2025-10-10T02:41:22.4739630Z half_test 2025-10-10T02:41:22.4739993Z hip_apply_test 2025-10-10T02:41:22.4740386Z hip_complex_math_test 2025-10-10T02:41:22.4740805Z hip_complex_test 2025-10-10T02:41:22.4741206Z hip_distributions_test 2025-10-10T02:41:22.4741633Z hip_dlconvertor_test 2025-10-10T02:41:22.4742042Z hip_generator_test 2025-10-10T02:41:22.4742431Z hip_half_test 2025-10-10T02:41:22.4742835Z hip_integer_divider_test 2025-10-10T02:41:22.4743473Z hip_optional_test 2025-10-10T02:41:22.4743914Z hip_packedtensoraccessor_test 2025-10-10T02:41:22.4744400Z hip_vectorized_test 2025-10-10T02:41:22.4744818Z inline_container_test 2025-10-10T02:41:22.4745234Z ivalue_test 2025-10-10T02:41:22.4745625Z kernel_function_legacy_test 2025-10-10T02:41:22.4746090Z kernel_function_test 2025-10-10T02:41:22.4746539Z kernel_lambda_legacy_test 2025-10-10T02:41:22.4747035Z kernel_lambda_test 2025-10-10T02:41:22.4747438Z kernel_stackbased_test 2025-10-10T02:41:22.4747852Z lazy_tensor_test 2025-10-10T02:41:22.4748232Z legacy_vmap_test 2025-10-10T02:41:22.4748611Z libc10.so 2025-10-10T02:41:22.4748966Z libc10_hip.so 2025-10-10T02:41:22.4749348Z libcaffe2_nvrtc.so 2025-10-10T02:41:22.4749729Z 'libmkldnn*' 2025-10-10T02:41:22.4750084Z 'libnccl*' 2025-10-10T02:41:22.4750431Z libtorch.so 2025-10-10T02:41:22.4750790Z libtorch_cpu.so 2025-10-10T02:41:22.4751191Z libtorch_global_deps.so 2025-10-10T02:41:22.4751652Z libtorch_hip.so 2025-10-10T02:41:22.4752045Z libtorch_python.so 2025-10-10T02:41:22.4752436Z libtorchbind_test.so 2025-10-10T02:41:22.4752882Z make_boxed_from_unboxed_functor_test 2025-10-10T02:41:22.4753392Z math_kernel_test 2025-10-10T02:41:22.4753956Z memory_format_test 2025-10-10T02:41:22.4754482Z memory_overlapping_test 2025-10-10T02:41:22.4754923Z mobile_memory_cleanup 
2025-10-10T02:41:22.4755323Z native_test 2025-10-10T02:41:22.4755685Z op_allowlist_test 2025-10-10T02:41:22.4756080Z op_registration_test 2025-10-10T02:41:22.4756555Z operator_name_test 2025-10-10T02:41:22.4756955Z operators_test 2025-10-10T02:41:22.4757368Z packedtensoraccessor_test 2025-10-10T02:41:22.4757833Z parallel_benchmark 2025-10-10T02:41:22.4758217Z pow_test 2025-10-10T02:41:22.4758565Z protoc 2025-10-10T02:41:22.4758916Z protoc-3.13.0.0 2025-10-10T02:41:22.4759294Z quantized_test 2025-10-10T02:41:22.4759675Z reduce_ops_test 2025-10-10T02:41:22.4760087Z reportMemoryUsage_test 2025-10-10T02:41:22.4760525Z scalar_tensor_test 2025-10-10T02:41:22.4760908Z scalar_test 2025-10-10T02:41:22.4761280Z static_runtime_bench 2025-10-10T02:41:22.4761692Z static_runtime_test 2025-10-10T02:41:22.4762111Z stride_properties_test 2025-10-10T02:41:22.4762540Z tensor_iterator_test 2025-10-10T02:41:22.4762937Z test_api 2025-10-10T02:41:22.4763285Z test_cpp_rpc 2025-10-10T02:41:22.4763653Z test_dist_autograd 2025-10-10T02:41:22.4764026Z test_jit 2025-10-10T02:41:22.4764363Z test_lazy 2025-10-10T02:41:22.4764716Z test_parallel 2025-10-10T02:41:22.4765092Z thread_init_test 2025-10-10T02:41:22.4765475Z torch_shm_manager 2025-10-10T02:41:22.4765918Z type_ptr_test 2025-10-10T02:41:22.4766570Z type_test 2025-10-10T02:41:22.4767011Z undefined_tensor_test 2025-10-10T02:41:22.4767516Z vec_test_all_types_AVX2 2025-10-10T02:41:22.4768040Z vec_test_all_types_AVX512 2025-10-10T02:41:22.4768577Z vec_test_all_types_DEFAULT 2025-10-10T02:41:22.4769106Z verify_api_visibility 2025-10-10T02:41:22.4769589Z weakref_test 2025-10-10T02:41:22.4770030Z wrapdim_test 2025-10-10T02:41:22.4770435Z xla_tensor_test 2025-10-10T02:41:22.4770854Z + aten/tools/run_tests.sh build/bin 2025-10-10T02:41:22.4771523Z + set -e 2025-10-10T02:41:22.4771924Z ++ dirname aten/tools/run_tests.sh 2025-10-10T02:41:22.4786066Z + VALGRIND_SUP=/var/lib/jenkins/pytorch/aten/tools/valgrind.sup 2025-10-10T02:41:22.4786879Z + export 
CPP_TESTS_DIR=build/bin 2025-10-10T02:41:22.4787418Z + CPP_TESTS_DIR=build/bin 2025-10-10T02:41:22.4787881Z + VALGRIND=OFF 2025-10-10T02:41:22.4790799Z + python test/run_test.py --cpp --verbose -i cpp/basic cpp/atest cpp/scalar_test cpp/broadcast_test cpp/wrapdim_test cpp/apply_utils_test cpp/dlconvertor_test cpp/native_test cpp/scalar_tensor_test cpp/undefined_tensor_test cpp/extension_backend_test cpp/lazy_tensor_test cpp/tensor_iterator_test cpp/Dimname_test cpp/Dict_test cpp/NamedTensor_test cpp/cpu_generator_test cpp/legacy_vmap_test cpp/operators_test 2025-10-10T02:41:26.3671401Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-10-10T02:41:26.3757967Z Found test times from artifacts 2025-10-10T02:41:26.4128300Z Found test times from artifacts 2025-10-10T02:41:26.4136751Z Running all tests 2025-10-10T02:41:26.4142773Z Running parallel tests on 8 processes 2025-10-10T02:41:26.4143465Z Name: tests to run (est. 
time: 0.0min) 2025-10-10T02:41:26.4144003Z Serial tests (0): 2025-10-10T02:41:26.4144456Z Parallel tests (19): 2025-10-10T02:41:26.4144921Z cpp/Dict_test 1/1 2025-10-10T02:41:26.4145358Z cpp/Dimname_test 1/1 2025-10-10T02:41:26.4145828Z cpp/NamedTensor_test 1/1 2025-10-10T02:41:26.4146319Z cpp/apply_utils_test 1/1 2025-10-10T02:41:26.4146768Z cpp/atest 1/1 2025-10-10T02:41:26.4147168Z cpp/basic 1/1 2025-10-10T02:41:26.4147585Z cpp/broadcast_test 1/1 2025-10-10T02:41:26.4148064Z cpp/cpu_generator_test 1/1 2025-10-10T02:41:26.4148574Z cpp/dlconvertor_test 1/1 2025-10-10T02:41:26.4149067Z cpp/extension_backend_test 1/1 2025-10-10T02:41:26.4149594Z cpp/lazy_tensor_test 1/1 2025-10-10T02:41:26.4150063Z cpp/legacy_vmap_test 1/1 2025-10-10T02:41:26.4150531Z cpp/native_test 1/1 2025-10-10T02:41:26.4150981Z cpp/operators_test 1/1 2025-10-10T02:41:26.4151878Z cpp/scalar_tensor_test 1/1 2025-10-10T02:41:26.4152370Z cpp/scalar_test 1/1 2025-10-10T02:41:26.4152829Z cpp/tensor_iterator_test 1/1 2025-10-10T02:41:26.4153336Z cpp/undefined_tensor_test 1/1 2025-10-10T02:41:26.4153838Z cpp/wrapdim_test 1/1 2025-10-10T02:41:26.4154454Z Name: excluded (est. time: 0.0min) 2025-10-10T02:41:26.4154939Z Serial tests (0): 2025-10-10T02:41:26.4155352Z Parallel tests (0): 2025-10-10T02:41:26.4155911Z Running cpp/Dict_test 1/1 ... [2025-10-10 02:41:26.414504] 2025-10-10T02:41:26.4156594Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:41:26.4157844Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/Dict_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:41:26.414763] 2025-10-10T02:41:33.4987782Z 2025-10-10T02:41:33.4989502Z cpp/Dict_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.Dict_test_1.1_cc91531ca1a95011_.log 2025-10-10T02:41:33.4990814Z 2025-10-10T02:41:33.4991432Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T02:41:33.4992299Z Uploading artifacts took 0.00 seconds 2025-10-10T02:41:33.4992998Z Running cpp/Dimname_test 1/1 ... [2025-10-10 02:41:33.498566] 2025-10-10T02:41:33.4993699Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:41:33.4996383Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/Dimname_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:41:33.499127] 2025-10-10T02:41:40.4841991Z 2025-10-10T02:41:40.4843557Z cpp/Dimname_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.Dimname_test_1.1_2a9073607637a43f_.log 2025-10-10T02:41:40.4844654Z 2025-10-10T02:41:40.4845054Z Running cpp/NamedTensor_test 1/1 ... [2025-10-10 02:41:40.483670] 2025-10-10T02:41:40.4845788Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:41:40.4847983Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/NamedTensor_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:41:40.484219] 2025-10-10T02:41:47.4683815Z 2025-10-10T02:41:47.4685186Z cpp/NamedTensor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.NamedTensor_test_1.1_493016db2f68a6c0_.log 2025-10-10T02:41:47.4686302Z 2025-10-10T02:41:47.4686718Z Running cpp/apply_utils_test 1/1 ... [2025-10-10 02:41:47.467928] 2025-10-10T02:41:47.4687458Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:41:47.4690664Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/apply_utils_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:41:47.468497] 2025-10-10T02:41:54.3010088Z 2025-10-10T02:41:54.3012188Z cpp/apply_utils_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.apply_utils_test_1.1_baf53206584b1236_.log 2025-10-10T02:41:54.3013338Z 2025-10-10T02:41:54.3013632Z Running cpp/atest 1/1 ... [2025-10-10 02:41:54.300459] 2025-10-10T02:41:54.3014281Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:41:54.3015596Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/atest', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:41:54.301061] 2025-10-10T02:42:01.1340073Z 2025-10-10T02:42:01.1341562Z cpp/atest 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.atest_1.1_879d3bf27bfcb038_.log 2025-10-10T02:42:01.1342571Z 2025-10-10T02:42:01.1342855Z Running cpp/basic 1/1 ... [2025-10-10 02:42:01.133552] 2025-10-10T02:42:01.1343524Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:01.1345169Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/basic', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:42:01.134133] 2025-10-10T02:42:08.0673395Z 2025-10-10T02:42:08.0675496Z cpp/basic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.basic_1.1_7c1349299d73aa9e_.log 2025-10-10T02:42:08.0676601Z 2025-10-10T02:42:08.0676961Z Running cpp/broadcast_test 1/1 ... [2025-10-10 02:42:08.066894] 2025-10-10T02:42:08.0677680Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:08.0680957Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/broadcast_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:42:08.067458] 2025-10-10T02:42:15.1005744Z 2025-10-10T02:42:15.1007217Z cpp/broadcast_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.broadcast_test_1.1_bbe6a62badb6aac3_.log 2025-10-10T02:42:15.1008578Z 2025-10-10T02:42:15.1009031Z Running cpp/cpu_generator_test 1/1 ... [2025-10-10 02:42:15.100098] 2025-10-10T02:42:15.1009829Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:15.1011952Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/cpu_generator_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:42:15.100650] 2025-10-10T02:42:21.9346110Z 2025-10-10T02:42:21.9347570Z cpp/cpu_generator_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.cpu_generator_test_1.1_23de6d7774039a7b_.log 2025-10-10T02:42:21.9348733Z 2025-10-10T02:42:21.9349112Z Running cpp/dlconvertor_test 1/1 ... [2025-10-10 02:42:21.934029] 2025-10-10T02:42:21.9350631Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:21.9352042Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/dlconvertor_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:42:21.934559] 2025-10-10T02:42:29.1200128Z 2025-10-10T02:42:29.1201781Z cpp/dlconvertor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.dlconvertor_test_1.1_c3dd268a1e86718c_.log 2025-10-10T02:42:29.1202970Z 2025-10-10T02:42:29.1203432Z Running cpp/extension_backend_test 1/1 ... [2025-10-10 02:42:29.119539] 2025-10-10T02:42:29.1205259Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:29.1206996Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/extension_backend_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:42:29.120086] 2025-10-10T02:42:35.9526460Z 2025-10-10T02:42:35.9528128Z cpp/extension_backend_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.extension_backend_test_1.1_696e6b397982b943_.log 2025-10-10T02:42:35.9529378Z 2025-10-10T02:42:35.9529738Z Running cpp/lazy_tensor_test 1/1 ... [2025-10-10 02:42:35.951992] 2025-10-10T02:42:35.9530489Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:35.9531785Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/lazy_tensor_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:42:35.952311] 2025-10-10T02:42:42.9350201Z 2025-10-10T02:42:42.9351772Z cpp/lazy_tensor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.lazy_tensor_test_1.1_fe47aa051ab01d39_.log 2025-10-10T02:42:42.9352942Z 2025-10-10T02:42:42.9353324Z Running cpp/legacy_vmap_test 1/1 ... [2025-10-10 02:42:42.934585] 2025-10-10T02:42:42.9354265Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:42.9358114Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/legacy_vmap_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:42:42.935122] 2025-10-10T02:42:49.9181794Z 2025-10-10T02:42:49.9183447Z cpp/legacy_vmap_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.legacy_vmap_test_1.1_c2dbd97741b6260d_.log 2025-10-10T02:42:49.9184580Z 2025-10-10T02:42:49.9184903Z Running cpp/native_test 1/1 ... [2025-10-10 02:42:49.917732] 2025-10-10T02:42:49.9185649Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:49.9189989Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/native_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:42:49.918331] 2025-10-10T02:42:56.9528373Z 2025-10-10T02:42:56.9529890Z cpp/native_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.native_test_1.1_9ad2b7958ecd8f88_.log 2025-10-10T02:42:56.9531005Z 2025-10-10T02:42:56.9531368Z Running cpp/operators_test 1/1 ... [2025-10-10 02:42:56.952338] 2025-10-10T02:42:56.9532108Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:42:56.9535123Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/operators_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:42:56.952969] 2025-10-10T02:43:03.8868297Z 2025-10-10T02:43:03.8869837Z cpp/operators_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.operators_test_1.1_6ae8544a8a017f59_.log 2025-10-10T02:43:03.8870983Z 2025-10-10T02:43:03.8871380Z Running cpp/scalar_tensor_test 1/1 ... [2025-10-10 02:43:03.886366] 2025-10-10T02:43:03.8872159Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:03.8875474Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/scalar_tensor_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:03.886984] 2025-10-10T02:43:10.8701806Z 2025-10-10T02:43:10.8703626Z cpp/scalar_tensor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.scalar_tensor_test_1.1_727d61c37c472c39_.log 2025-10-10T02:43:10.8715449Z 2025-10-10T02:43:10.8715853Z Running cpp/scalar_test 1/1 ... [2025-10-10 02:43:10.869721] 2025-10-10T02:43:10.8716580Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:10.8717878Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/scalar_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:43:10.870346] 2025-10-10T02:43:17.8053902Z 2025-10-10T02:43:17.8055667Z cpp/scalar_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.scalar_test_1.1_44db9a5e005cae85_.log 2025-10-10T02:43:17.8057783Z 2025-10-10T02:43:17.8083200Z Running cpp/tensor_iterator_test 1/1 ... [2025-10-10 02:43:17.804950] 2025-10-10T02:43:17.8084077Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:17.8085735Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/tensor_iterator_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:17.805575] 2025-10-10T02:43:24.8387449Z 2025-10-10T02:43:24.8389137Z cpp/tensor_iterator_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.tensor_iterator_test_1.1_f58de87a9e27763b_.log 2025-10-10T02:43:24.8390357Z 2025-10-10T02:43:24.8390763Z Running cpp/undefined_tensor_test 1/1 ... [2025-10-10 02:43:24.838335] 2025-10-10T02:43:24.8391574Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:24.8395709Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/undefined_tensor_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:24.838986] 2025-10-10T02:43:32.0226208Z 2025-10-10T02:43:32.0227729Z cpp/undefined_tensor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.undefined_tensor_test_1.1_8a139c022d3bc0b6_.log 2025-10-10T02:43:32.0228956Z 2025-10-10T02:43:32.0235235Z Running cpp/wrapdim_test 1/1 ... [2025-10-10 02:43:32.022217] 2025-10-10T02:43:32.0236236Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:32.0237818Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/wrapdim_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:43:32.022852] 2025-10-10T02:43:38.8070310Z 2025-10-10T02:43:38.8072170Z cpp/wrapdim_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.wrapdim_test_1.1_24466d257c30f09b_.log 2025-10-10T02:43:38.8073364Z 2025-10-10T02:43:42.3596655Z Running cpp/Dict_test 1/1 ... [2025-10-10 02:43:42.358225] 2025-10-10T02:43:42.3597373Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:42.3598953Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/Dict_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:42.358771] 2025-10-10T02:43:42.5167569Z Running cpp/Dimname_test 1/1 ... [2025-10-10 02:43:42.516111] 2025-10-10T02:43:42.5168367Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:42.5170037Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/Dimname_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:42.516651] 2025-10-10T02:43:42.6324754Z Running cpp/NamedTensor_test 1/1 ... [2025-10-10 02:43:42.631566] 2025-10-10T02:43:42.6325594Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:42.6326985Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/NamedTensor_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:42.632238] 2025-10-10T02:43:42.6615180Z Running cpp/apply_utils_test 1/1 ... [2025-10-10 02:43:42.660946] 2025-10-10T02:43:42.6616168Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:42.6618260Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/apply_utils_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:42.661467] 2025-10-10T02:43:42.6631585Z Running cpp/atest 1/1 ... [2025-10-10 02:43:42.662751] 2025-10-10T02:43:42.6632354Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:42.6633472Z Running cpp/basic 1/1 ... 
[2025-10-10 02:43:42.662830] 2025-10-10T02:43:42.6634303Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:42.6639784Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/atest', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:42.663539] 2025-10-10T02:43:42.6641703Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/basic', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:42.663670] 2025-10-10T02:43:42.6848595Z Running cpp/broadcast_test 1/1 ... [2025-10-10 02:43:42.684282] 2025-10-10T02:43:42.6849443Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:42.6850194Z Running cpp/cpu_generator_test 1/1 ... [2025-10-10 02:43:42.684621] 2025-10-10T02:43:42.6850941Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:42.6852306Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/broadcast_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:42.684771] 2025-10-10T02:43:42.6855183Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/cpu_generator_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:42.685099] 2025-10-10T02:43:52.4976286Z 2025-10-10T02:43:52.4977385Z cpp/Dict_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.Dict_test_1.1_2f28e0158297b5b3_.log 2025-10-10T02:43:52.4978523Z 2025-10-10T02:43:52.4978767Z Running cpp/dlconvertor_test 1/1 ... [2025-10-10 02:43:52.497527] 2025-10-10T02:43:52.4979239Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:52.4982944Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/dlconvertor_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:43:52.497930] 2025-10-10T02:43:54.3287606Z 2025-10-10T02:43:54.3289178Z cpp/cpu_generator_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.cpu_generator_test_1.1_1ae8085c087b8a2b_.log 2025-10-10T02:43:54.3290384Z 2025-10-10T02:43:54.3290804Z Running cpp/extension_backend_test 1/1 ... [2025-10-10 02:43:54.328638] 2025-10-10T02:43:54.3291588Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:54.3297286Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/extension_backend_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:54.329308] 2025-10-10T02:43:54.4067917Z 2025-10-10T02:43:54.4069737Z cpp/atest 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.atest_1.1_42f4e4135d33ecb1_.log 2025-10-10T02:43:54.4070770Z 2025-10-10T02:43:54.4071125Z Running cpp/lazy_tensor_test 1/1 ... [2025-10-10 02:43:54.406881] 2025-10-10T02:43:54.4071865Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:54.4075715Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/lazy_tensor_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:54.407299] 2025-10-10T02:43:54.4555520Z 2025-10-10T02:43:54.4556829Z cpp/apply_utils_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.apply_utils_test_1.1_9034d2d633063e50_.log 2025-10-10T02:43:54.4557984Z 2025-10-10T02:43:54.4558371Z Running cpp/legacy_vmap_test 1/1 ... [2025-10-10 02:43:54.454909] 2025-10-10T02:43:54.4559127Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:54.4560476Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/legacy_vmap_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:43:54.455551] 2025-10-10T02:43:54.4794527Z 2025-10-10T02:43:54.4795832Z cpp/broadcast_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.broadcast_test_1.1_9c38d6fd51c7b56d_.log 2025-10-10T02:43:54.4796929Z 2025-10-10T02:43:54.4797274Z Running cpp/native_test 1/1 ... [2025-10-10 02:43:54.478772] 2025-10-10T02:43:54.4798409Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:54.4799719Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/native_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:54.479171] 2025-10-10T02:43:54.5076424Z 2025-10-10T02:43:54.5078403Z cpp/basic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.basic_1.1_43f408fb13575700_.log 2025-10-10T02:43:54.5079492Z 2025-10-10T02:43:54.5079851Z Running cpp/operators_test 1/1 ... [2025-10-10 02:43:54.507365] 2025-10-10T02:43:54.5081097Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:54.5084499Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/operators_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:54.507993] 2025-10-10T02:43:54.5123853Z 2025-10-10T02:43:54.5125046Z cpp/Dimname_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.Dimname_test_1.1_6c948343617a98a3_.log 2025-10-10T02:43:54.5126140Z 2025-10-10T02:43:54.5128343Z Running cpp/scalar_tensor_test 1/1 ... [2025-10-10 02:43:54.512483] 2025-10-10T02:43:54.5129138Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:54.5136298Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/scalar_tensor_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:43:54.513115] 2025-10-10T02:43:54.5250806Z 2025-10-10T02:43:54.5252258Z cpp/NamedTensor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.NamedTensor_test_1.1_dfeb25ac99b55d7c_.log 2025-10-10T02:43:54.5253419Z 2025-10-10T02:43:54.5253742Z Running cpp/scalar_test 1/1 ... [2025-10-10 02:43:54.524396] 2025-10-10T02:43:54.5254484Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:43:54.5255787Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/scalar_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:43:54.524785] 2025-10-10T02:44:01.8438706Z 2025-10-10T02:44:01.8440479Z cpp/dlconvertor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.dlconvertor_test_1.1_b78fd1c463e3ad9a_.log 2025-10-10T02:44:01.8441359Z 2025-10-10T02:44:01.8441655Z Running cpp/tensor_iterator_test 1/1 ... [2025-10-10 02:44:01.843613] 2025-10-10T02:44:01.8442262Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:44:01.8444646Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/tensor_iterator_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:44:01.844048] 2025-10-10T02:44:07.1035287Z 2025-10-10T02:44:07.1037291Z cpp/legacy_vmap_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.legacy_vmap_test_1.1_1533bb91bab65f84_.log 2025-10-10T02:44:07.1038659Z 2025-10-10T02:44:07.1039108Z Running cpp/undefined_tensor_test 1/1 ... [2025-10-10 02:44:07.103254] 2025-10-10T02:44:07.1040095Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:44:07.1041725Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/undefined_tensor_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:44:07.103714] 2025-10-10T02:44:07.1524550Z 2025-10-10T02:44:07.1525356Z cpp/lazy_tensor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.lazy_tensor_test_1.1_9171d9f58c04c2dd_.log 2025-10-10T02:44:07.1526079Z 2025-10-10T02:44:07.1526301Z Running cpp/wrapdim_test 1/1 ... [2025-10-10 02:44:07.152366] 2025-10-10T02:44:07.1526784Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:44:07.1531373Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/wrapdim_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:44:07.152786] 2025-10-10T02:44:07.1586912Z 2025-10-10T02:44:07.1588086Z cpp/scalar_tensor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.scalar_tensor_test_1.1_409fa4c109a0922e_.log 2025-10-10T02:44:07.1588820Z 2025-10-10T02:44:07.1747673Z 2025-10-10T02:44:07.1748458Z cpp/native_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.native_test_1.1_fff5b7111ab0236a_.log 2025-10-10T02:44:07.1749097Z 2025-10-10T02:44:07.1752653Z 2025-10-10T02:44:07.1753462Z cpp/extension_backend_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.extension_backend_test_1.1_63aab948a54b509d_.log 2025-10-10T02:44:07.1754584Z 2025-10-10T02:44:07.3175190Z 2025-10-10T02:44:07.3176085Z cpp/scalar_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.scalar_test_1.1_7c4cc710b62550d1_.log 2025-10-10T02:44:07.3176724Z 2025-10-10T02:44:07.3528149Z 2025-10-10T02:44:07.3528928Z cpp/operators_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.operators_test_1.1_32482972fa000c97_.log 2025-10-10T02:44:07.3529625Z 2025-10-10T02:44:10.3288286Z 2025-10-10T02:44:10.3289562Z cpp/tensor_iterator_test 1/1 was successful, full logs can be found in artifacts with path 
test/test-reports/cpp.tensor_iterator_test_1.1_fbd7ca1d376e3c13_.log 2025-10-10T02:44:10.3290350Z 2025-10-10T02:44:15.4911121Z 2025-10-10T02:44:15.4912856Z cpp/undefined_tensor_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.undefined_tensor_test_1.1_a168b7d36cfd1e0d_.log 2025-10-10T02:44:15.4915179Z 2025-10-10T02:44:15.8409807Z 2025-10-10T02:44:15.8411126Z cpp/wrapdim_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.wrapdim_test_1.1_82257921f62d4609_.log 2025-10-10T02:44:15.8412207Z 2025-10-10T02:44:16.7704211Z Running test batch 'tests to run' cost 170.36 seconds 2025-10-10T02:44:17.5025060Z + run_if_exists tensor_interop_test 2025-10-10T02:44:17.5025663Z + local test_name=tensor_interop_test 2025-10-10T02:44:17.5026125Z + [[ -x build/bin/tensor_interop_test ]] 2025-10-10T02:44:17.5026587Z + echo 'Warning: tensor_interop_test does not exist.' 2025-10-10T02:44:17.5027052Z Warning: tensor_interop_test does not exist. 2025-10-10T02:44:17.5027419Z + run_if_exists cudnn_test 2025-10-10T02:44:17.5027725Z + local test_name=cudnn_test 2025-10-10T02:44:17.5028024Z + [[ -x build/bin/cudnn_test ]] 2025-10-10T02:44:17.5028353Z + echo 'Warning: cudnn_test does not exist.' 2025-10-10T02:44:17.5028721Z Warning: cudnn_test does not exist. 2025-10-10T02:44:17.5029067Z + run_if_exists cuda_generator_test 2025-10-10T02:44:17.5029412Z + local test_name=cuda_generator_test 2025-10-10T02:44:17.5030350Z + [[ -x build/bin/cuda_generator_test ]] 2025-10-10T02:44:17.5030752Z + echo 'Warning: cuda_generator_test does not exist.' 2025-10-10T02:44:17.5031155Z Warning: cuda_generator_test does not exist. 2025-10-10T02:44:17.5031505Z + run_if_exists apply_test 2025-10-10T02:44:17.5031795Z + local test_name=apply_test 2025-10-10T02:44:17.5032103Z + [[ -x build/bin/apply_test ]] 2025-10-10T02:44:17.5032444Z + echo 'Warning: apply_test does not exist.' 
2025-10-10T02:44:17.5032799Z Warning: apply_test does not exist. 2025-10-10T02:44:17.5033121Z + run_if_exists stream_test 2025-10-10T02:44:17.5033420Z + local test_name=stream_test 2025-10-10T02:44:17.5033720Z + [[ -x build/bin/stream_test ]] 2025-10-10T02:44:17.5034056Z + echo 'Warning: stream_test does not exist.' 2025-10-10T02:44:17.5034580Z Warning: stream_test does not exist. 2025-10-10T02:44:17.5034907Z + run_if_exists cuda_half_test 2025-10-10T02:44:17.5035207Z + local test_name=cuda_half_test 2025-10-10T02:44:17.5035513Z + [[ -x build/bin/cuda_half_test ]] 2025-10-10T02:44:17.5035873Z + echo 'Warning: cuda_half_test does not exist.' 2025-10-10T02:44:17.5036241Z Warning: cuda_half_test does not exist. 2025-10-10T02:44:17.5036580Z + run_if_exists cuda_vectorized_test 2025-10-10T02:44:17.5036913Z + local test_name=cuda_vectorized_test 2025-10-10T02:44:17.5037254Z + [[ -x build/bin/cuda_vectorized_test ]] 2025-10-10T02:44:17.5037835Z + echo 'Warning: cuda_vectorized_test does not exist.' 2025-10-10T02:44:17.5038262Z Warning: cuda_vectorized_test does not exist. 2025-10-10T02:44:17.5038657Z + run_if_exists cuda_distributions_test 2025-10-10T02:44:17.5039030Z + local test_name=cuda_distributions_test 2025-10-10T02:44:17.5039387Z + [[ -x build/bin/cuda_distributions_test ]] 2025-10-10T02:44:17.5039792Z + echo 'Warning: cuda_distributions_test does not exist.' 2025-10-10T02:44:17.5040224Z Warning: cuda_distributions_test does not exist. 2025-10-10T02:44:17.5040593Z + run_if_exists cuda_optional_test 2025-10-10T02:44:17.5041097Z + local test_name=cuda_optional_test 2025-10-10T02:44:17.5041422Z + [[ -x build/bin/cuda_optional_test ]] 2025-10-10T02:44:17.5041809Z + echo 'Warning: cuda_optional_test does not exist.' 2025-10-10T02:44:17.5042207Z Warning: cuda_optional_test does not exist. 
2025-10-10T02:44:17.5042577Z + run_if_exists cuda_tensor_interop_test 2025-10-10T02:44:17.5042928Z + local test_name=cuda_tensor_interop_test 2025-10-10T02:44:17.5043291Z + [[ -x build/bin/cuda_tensor_interop_test ]] 2025-10-10T02:44:17.5043693Z + echo 'Warning: cuda_tensor_interop_test does not exist.' 2025-10-10T02:44:17.5044124Z Warning: cuda_tensor_interop_test does not exist. 2025-10-10T02:44:17.5044502Z + run_if_exists cuda_complex_test 2025-10-10T02:44:17.5044825Z + local test_name=cuda_complex_test 2025-10-10T02:44:17.5045156Z + [[ -x build/bin/cuda_complex_test ]] 2025-10-10T02:44:17.5045684Z + echo 'Warning: cuda_complex_test does not exist.' 2025-10-10T02:44:17.5046075Z Warning: cuda_complex_test does not exist. 2025-10-10T02:44:17.5046434Z + run_if_exists cuda_complex_math_test 2025-10-10T02:44:17.5046777Z + local test_name=cuda_complex_math_test 2025-10-10T02:44:17.5047132Z + [[ -x build/bin/cuda_complex_math_test ]] 2025-10-10T02:44:17.5047531Z + echo 'Warning: cuda_complex_math_test does not exist.' 2025-10-10T02:44:17.5047958Z Warning: cuda_complex_math_test does not exist. 2025-10-10T02:44:17.5048316Z + run_if_exists cuda_cub_test 2025-10-10T02:44:17.5048621Z + local test_name=cuda_cub_test 2025-10-10T02:44:17.5048929Z + [[ -x build/bin/cuda_cub_test ]] 2025-10-10T02:44:17.5049269Z + echo 'Warning: cuda_cub_test does not exist.' 2025-10-10T02:44:17.5049623Z Warning: cuda_cub_test does not exist. 2025-10-10T02:44:17.5049960Z + run_if_exists cuda_atomic_ops_test 2025-10-10T02:44:17.5050293Z + local test_name=cuda_atomic_ops_test 2025-10-10T02:44:17.5050630Z + [[ -x build/bin/cuda_atomic_ops_test ]] 2025-10-10T02:44:17.5051014Z + echo 'Warning: cuda_atomic_ops_test does not exist.' 2025-10-10T02:44:17.5051422Z Warning: cuda_atomic_ops_test does not exist. 
2025-10-10T02:44:17.5051898Z + run_if_exists cuda_allocator_test 2025-10-10T02:44:17.5052232Z + local test_name=cuda_allocator_test 2025-10-10T02:44:17.5052569Z + [[ -x build/bin/cuda_allocator_test ]] 2025-10-10T02:44:17.5052942Z + echo 'Warning: cuda_allocator_test does not exist.' 2025-10-10T02:44:17.5053340Z Warning: cuda_allocator_test does not exist. 2025-10-10T02:44:17.5053668Z + '[' OFF == ON ']' 2025-10-10T02:44:17.5053933Z + [[ -n '' ]] 2025-10-10T02:44:17.5054177Z + assert_git_not_dirty 2025-10-10T02:44:17.5054476Z + [[ linux-jammy-rocm-py3.10 != *rocm* ]] 2025-10-10T02:44:17.5054807Z + test_libtorch 1 2025-10-10T02:44:17.5055058Z + local SHARD=1 2025-10-10T02:44:17.5055296Z + [[ default != \s\l\o\w ]] 2025-10-10T02:44:17.5055583Z + echo 'Testing libtorch' 2025-10-10T02:44:17.5055861Z Testing libtorch 2025-10-10T02:44:17.5056654Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libbackend_with_compiler.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5085182Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libjitbackend_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5122074Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libcaffe2_nvrtc.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5160849Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10_hip.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5192317Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libshm /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libshm.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libshm_windows /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5231851Z + ln -sf 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_hip.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_python.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorchbind_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5263521Z + ln -sf '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libnvfuser*' /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5300371Z + export CPP_TESTS_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5301914Z + CPP_TESTS_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:44:17.5302710Z + [[ -z 1 ]] 2025-10-10T02:44:17.5303117Z + [[ 1 == \1 ]] 2025-10-10T02:44:17.5303519Z + test_libtorch_api 2025-10-10T02:44:17.5304053Z + MNIST_DIR=/var/lib/jenkins/pytorch/test/cpp/api/mnist 2025-10-10T02:44:17.5305098Z + python tools/download_mnist.py --quiet -d /var/lib/jenkins/pytorch/test/cpp/api/mnist 2025-10-10T02:44:17.6001890Z Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz ... 2025-10-10T02:44:19.0788220Z Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz ... 2025-10-10T02:44:19.2305215Z Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz ... 2025-10-10T02:44:19.5992503Z Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz ... 
2025-10-10T02:44:19.7224516Z + [[ linux-jammy-rocm-py3.10 == *asan* ]] 2025-10-10T02:44:19.7225307Z + [[ linux-jammy-rocm-py3.10 == *slow-gradcheck* ]] 2025-10-10T02:44:19.7226018Z + OMP_NUM_THREADS=2 2025-10-10T02:44:19.7226655Z + TORCH_CPP_TEST_MNIST_PATH=/var/lib/jenkins/pytorch/test/cpp/api/mnist 2025-10-10T02:44:19.7228378Z + python test/run_test.py --cpp --verbose -i cpp/test_api -k 'not IMethodTest' 2025-10-10T02:44:23.5839486Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-10-10T02:44:23.5926374Z Found test times from artifacts 2025-10-10T02:44:23.6303795Z Found test times from artifacts 2025-10-10T02:44:23.6312373Z Running all tests 2025-10-10T02:44:23.6315808Z Running parallel tests on 8 processes 2025-10-10T02:44:23.6316469Z Name: tests to run (est. time: 0.0min) 2025-10-10T02:44:23.6317005Z Serial tests (0): 2025-10-10T02:44:23.6317436Z Parallel tests (1): 2025-10-10T02:44:23.6317896Z cpp/test_api 1/1 2025-10-10T02:44:23.6318352Z Name: excluded (est. time: 0.0min) 2025-10-10T02:44:23.6318883Z Serial tests (0): 2025-10-10T02:44:23.6319319Z Parallel tests (0): 2025-10-10T02:44:23.6320142Z Running cpp/test_api 1/1 ... [2025-10-10 02:44:23.631603] 2025-10-10T02:44:23.6320847Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:44:23.6323128Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_api', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-k', 'not IMethodTest', '-x', '--reruns=2'] ... 
[2025-10-10 02:44:23.631896] 2025-10-10T02:44:31.1222998Z 2025-10-10T02:44:31.1224254Z cpp/test_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_api_1.1_eaecc0fdfeccb351_.log 2025-10-10T02:44:31.1229078Z 2025-10-10T02:44:31.1229590Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T02:44:31.1230491Z Uploading artifacts took 0.00 seconds 2025-10-10T02:44:34.4095942Z Running cpp/test_api 1/1 ... [2025-10-10 02:44:34.406819] 2025-10-10T02:44:34.4096858Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:44:34.4098532Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_api', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-k', 'not IMethodTest', '-x', '--reruns=2'] ... [2025-10-10 02:44:34.407537] 2025-10-10T02:44:41.7954729Z 2025-10-10T02:44:41.7956392Z cpp/test_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_api_1.1_72751c5b606b3e10_.log 2025-10-10T02:44:41.7957479Z 2025-10-10T02:44:42.4465766Z Running test batch 'tests to run' cost 18.81 seconds 2025-10-10T02:44:42.9825611Z + [[ linux-jammy-rocm-py3.10 != *android* ]] 2025-10-10T02:44:42.9827634Z + [[ linux-jammy-rocm-py3.10 != *cuda* ]] 2025-10-10T02:44:42.9828540Z + [[ linux-jammy-rocm-py3.10 != *asan* ]] 2025-10-10T02:44:42.9829191Z + [[ linux-jammy-rocm-py3.10 != *s390x* ]] 2025-10-10T02:44:42.9829776Z + export CPP_TESTS_DIR=build/bin 2025-10-10T02:44:42.9830321Z + CPP_TESTS_DIR=build/bin 2025-10-10T02:44:42.9831037Z + python test/run_test.py --cpp --verbose -i cpp/static_runtime_test 2025-10-10T02:44:46.8869016Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-10-10T02:44:46.8957239Z Found test times from artifacts 2025-10-10T02:44:46.9335210Z Found test times from artifacts 2025-10-10T02:44:46.9345882Z Running all tests 
2025-10-10T02:44:46.9351705Z Running parallel tests on 8 processes 2025-10-10T02:44:46.9352549Z Name: tests to run (est. time: 0.0min) 2025-10-10T02:44:46.9353114Z Serial tests (0): 2025-10-10T02:44:46.9353555Z Parallel tests (1): 2025-10-10T02:44:46.9354057Z cpp/static_runtime_test 1/1 2025-10-10T02:44:46.9355167Z Name: excluded (est. time: 0.0min) 2025-10-10T02:44:46.9355673Z Serial tests (0): 2025-10-10T02:44:46.9356145Z Parallel tests (0): 2025-10-10T02:44:46.9356887Z Running cpp/static_runtime_test 1/1 ... [2025-10-10 02:44:46.934916] 2025-10-10T02:44:46.9357795Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:44:46.9359244Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/static_runtime_test', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... [2025-10-10 02:44:46.935201] 2025-10-10T02:44:54.0208270Z 2025-10-10T02:44:54.0210956Z cpp/static_runtime_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.static_runtime_test_1.1_9f7714fcc8b8e346_.log 2025-10-10T02:44:54.0212204Z 2025-10-10T02:44:54.0212673Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-10-10T02:44:54.0213503Z Uploading artifacts took 0.00 seconds 2025-10-10T02:44:57.4176691Z Running cpp/static_runtime_test 1/1 ... [2025-10-10 02:44:57.417020] 2025-10-10T02:44:57.4177263Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-10-10T02:44:57.4178127Z Executing ['pytest', '/var/lib/jenkins/pytorch/build/bin/static_runtime_test', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '8', '-x', '--reruns=2'] ... 
[2025-10-10 02:44:57.417449] 2025-10-10T02:45:04.9542388Z 2025-10-10T02:45:04.9543901Z cpp/static_runtime_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.static_runtime_test_1.1_a7a19d5f08be9836_.log 2025-10-10T02:45:04.9545183Z 2025-10-10T02:45:05.6669281Z Running test batch 'tests to run' cost 18.73 seconds 2025-10-10T02:45:06.2185552Z + [[ -z 1 ]] 2025-10-10T02:45:06.2186106Z + [[ 1 == \2 ]] 2025-10-10T02:45:06.2187030Z + assert_git_not_dirty 2025-10-10T02:45:06.2187613Z + [[ linux-jammy-rocm-py3.10 != *rocm* ]] 2025-10-10T02:45:06.2188252Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-10-10T02:45:06.2188871Z + sccache_epilogue 2025-10-10T02:45:06.2189881Z + echo '::group::Sccache Compilation Log' 2025-10-10T02:45:06.2190987Z ##[group]Sccache Compilation Log 2025-10-10T02:45:06.2191610Z + echo '=================== sccache compilation log ===================' 2025-10-10T02:45:06.2192325Z =================== sccache compilation log =================== 2025-10-10T02:45:06.2193377Z + python /var/lib/jenkins/pytorch/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log 2025-10-10T02:45:06.2478846Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ===========' 2025-10-10T02:45:06.2480593Z =========== If your build fails, please take a look at the log above for possible reasons =========== 2025-10-10T02:45:06.2481478Z + sccache --show-stats 2025-10-10T02:45:06.2541209Z Compile requests 3985 2025-10-10T02:45:06.2541899Z Compile requests executed 108 2025-10-10T02:45:06.2542518Z Cache hits 14 2025-10-10T02:45:06.2543049Z Cache hits (C/C++) 8 2025-10-10T02:45:06.2543594Z Cache hits (HIP) 6 2025-10-10T02:45:06.2544134Z Cache misses 93 2025-10-10T02:45:06.2544674Z Cache misses (C/C++) 79 2025-10-10T02:45:06.2545187Z Cache misses (HIP) 14 2025-10-10T02:45:06.2545733Z Cache hits rate 13.08 % 2025-10-10T02:45:06.2546678Z Cache hits rate (C/C++) 9.20 % 
2025-10-10T02:45:06.2547215Z Cache hits rate (HIP) 30.00 % 2025-10-10T02:45:06.2547757Z Cache timeouts 0 2025-10-10T02:45:06.2548281Z Cache read errors 0 2025-10-10T02:45:06.2548824Z Forced recaches 0 2025-10-10T02:45:06.2549338Z Cache write errors 0 2025-10-10T02:45:06.2549858Z Cache errors 0 2025-10-10T02:45:06.2550404Z Compilations 93 2025-10-10T02:45:06.2550945Z Compilation failures 1 2025-10-10T02:45:06.2551511Z Non-cacheable compilations 0 2025-10-10T02:45:06.2552070Z Non-cacheable calls 4 2025-10-10T02:45:06.2552605Z Non-compilation calls 3873 2025-10-10T02:45:06.2553291Z Unsupported compiler calls 0 2025-10-10T02:45:06.2553962Z Average cache write 0.001 s 2025-10-10T02:45:06.2554885Z Average compiler 14.709 s 2025-10-10T02:45:06.2555576Z Average cache read hit 0.000 s 2025-10-10T02:45:06.2556235Z Failed distributed compilations 0 2025-10-10T02:45:06.2556674Z 2025-10-10T02:45:06.2556900Z Non-cacheable reasons: 2025-10-10T02:45:06.2557749Z -E 4 2025-10-10T02:45:06.2558190Z 2025-10-10T02:45:06.2558612Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache" 2025-10-10T02:45:06.2559406Z Use direct/preprocessor mode? yes 2025-10-10T02:45:06.2559971Z Version (client) 0.10.0 2025-10-10T02:45:06.2560515Z Cache size 51 MiB 2025-10-10T02:45:06.2561102Z Max cache size 10 GiB 2025-10-10T02:45:06.2569625Z + sccache --stop-server 2025-10-10T02:45:06.2612004Z Stopping sccache server... 
2025-10-10T02:45:06.2619785Z Compile requests 3985 2025-10-10T02:45:06.2620539Z Compile requests executed 108 2025-10-10T02:45:06.2621126Z Cache hits 14 2025-10-10T02:45:06.2621703Z Cache hits (C/C++) 8 2025-10-10T02:45:06.2622236Z Cache hits (HIP) 6 2025-10-10T02:45:06.2622823Z Cache misses 93 2025-10-10T02:45:06.2623462Z Cache misses (C/C++) 79 2025-10-10T02:45:06.2624085Z Cache misses (HIP) 14 2025-10-10T02:45:06.2624725Z Cache hits rate 13.08 % 2025-10-10T02:45:06.2625383Z Cache hits rate (C/C++) 9.20 % 2025-10-10T02:45:06.2625922Z Cache hits rate (HIP) 30.00 % 2025-10-10T02:45:06.2626450Z Cache timeouts 0 2025-10-10T02:45:06.2627514Z Cache read errors 0 2025-10-10T02:45:06.2628074Z Forced recaches 0 2025-10-10T02:45:06.2628607Z Cache write errors 0 2025-10-10T02:45:06.2629120Z Cache errors 0 2025-10-10T02:45:06.2629639Z Compilations 93 2025-10-10T02:45:06.2630168Z Compilation failures 1 2025-10-10T02:45:06.2630709Z Non-cacheable compilations 0 2025-10-10T02:45:06.2631489Z Non-cacheable calls 4 2025-10-10T02:45:06.2632020Z Non-compilation calls 3873 2025-10-10T02:45:06.2632558Z Unsupported compiler calls 0 2025-10-10T02:45:06.2633105Z Average cache write 0.001 s 2025-10-10T02:45:06.2633670Z Average compiler 14.709 s 2025-10-10T02:45:06.2634409Z Average cache read hit 0.000 s 2025-10-10T02:45:06.2634990Z Failed distributed compilations 0 2025-10-10T02:45:06.2635385Z 2025-10-10T02:45:06.2635590Z Non-cacheable reasons: 2025-10-10T02:45:06.2636045Z -E 4 2025-10-10T02:45:06.2636405Z 2025-10-10T02:45:06.2636756Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache" 2025-10-10T02:45:06.2637505Z Use direct/preprocessor mode? 
yes 2025-10-10T02:45:06.2638061Z Version (client) 0.10.0 2025-10-10T02:45:06.2638896Z Cache size 51 MiB 2025-10-10T02:45:06.2639460Z Max cache size 10 GiB 2025-10-10T02:45:06.2640035Z + echo ::endgroup:: 2025-10-10T02:45:06.2641081Z ##[endgroup] 2025-10-10T02:45:06.2808776Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-10-10T02:45:06.2810238Z # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-10-10T02:45:06.2811965Z docker exec -t "496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test" 2025-10-10T02:45:06.2868856Z shell: /usr/bin/bash -e {0} 2025-10-10T02:45:06.2869344Z env: 2025-10-10T02:45:06.2869727Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:06.2870491Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:06.2871635Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:06.2872658Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:06.2874576Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:06.2876138Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:06.2876632Z AWS_REGION: us-east-1 2025-10-10T02:45:06.2877201Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:06.2877843Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:06.2887469Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:06.2888228Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:06.2889051Z ##[endgroup] 2025-10-10T02:45:06.5053511Z ##[group]Run cat test/**/*_toprint.log || true 2025-10-10T02:45:06.5054249Z cat test/**/*_toprint.log || true 2025-10-10T02:45:06.5114846Z shell: 
/usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:06.5115535Z env: 2025-10-10T02:45:06.5115971Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:06.5116889Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:06.5118190Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:06.5119243Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:06.5120975Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:06.5122534Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:06.5123041Z AWS_REGION: us-east-1 2025-10-10T02:45:06.5123662Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:06.5124312Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:06.5134209Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:06.5134973Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:06.5135787Z ##[endgroup] 2025-10-10T02:45:06.5329148Z cat: 'test/**/*_toprint.log': No such file or directory 2025-10-10T02:45:06.5562693Z Prepare all required actions 2025-10-10T02:45:06.5564202Z Getting action download info 2025-10-10T02:45:06.8189516Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-10-10T02:45:07.3348742Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-10-10T02:45:07.9192555Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-10-10T02:45:07.9192812Z with: 2025-10-10T02:45:07.9192968Z use-gha: true 2025-10-10T02:45:07.9193201Z file-suffix: test-default-1-6-linux.rocm.gpu.2_52406492265 2025-10-10T02:45:07.9193483Z s3-bucket: gha-artifacts 2025-10-10T02:45:07.9193662Z env: 2025-10-10T02:45:07.9193811Z GIT_DEFAULT_BRANCH: main 
2025-10-10T02:45:07.9194295Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:07.9194735Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:07.9195180Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:07.9195869Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:07.9196481Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:07.9196682Z AWS_REGION: us-east-1 2025-10-10T02:45:07.9196945Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:07.9197261Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:07.9202139Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:07.9202509Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:07.9202903Z ##[endgroup] 2025-10-10T02:45:07.9272306Z ##[group]Run actions/upload-artifact@v4 2025-10-10T02:45:07.9272542Z with: 2025-10-10T02:45:07.9272835Z name: test-jsons-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip 2025-10-10T02:45:07.9273197Z retention-days: 14 2025-10-10T02:45:07.9273388Z if-no-files-found: warn 2025-10-10T02:45:07.9273580Z path: test/**/*.json 2025-10-10T02:45:07.9273759Z compression-level: 6 2025-10-10T02:45:07.9273935Z overwrite: false 2025-10-10T02:45:07.9274201Z include-hidden-files: false 2025-10-10T02:45:07.9274392Z env: 2025-10-10T02:45:07.9274549Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:07.9274860Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:07.9275364Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:07.9275819Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:07.9276548Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri 
--group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:07.9277207Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:07.9277435Z AWS_REGION: us-east-1 2025-10-10T02:45:07.9277674Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:07.9277969Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:07.9281959Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:07.9282313Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:07.9282685Z ##[endgroup] 2025-10-10T02:45:08.7946185Z With the provided path, there will be 10 files uploaded 2025-10-10T02:45:08.7952338Z Artifact name is valid! 2025-10-10T02:45:08.7952940Z Root directory input is valid! 2025-10-10T02:45:08.9280587Z Beginning upload of artifact content to blob storage 2025-10-10T02:45:09.1626279Z Uploaded bytes 46912 2025-10-10T02:45:09.2094485Z Finished uploading artifact content to blob storage! 2025-10-10T02:45:09.2100051Z SHA256 digest of uploaded artifact zip is 86bac8c67432ed761d04969f94becf3598d9afb909f0cf4e83475e7772086197 2025-10-10T02:45:09.2102423Z Finalizing artifact upload 2025-10-10T02:45:09.3095322Z Artifact test-jsons-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip.zip successfully finalized. Artifact ID 4233090246 2025-10-10T02:45:09.3097561Z Artifact test-jsons-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip has been successfully uploaded! Final size is 46912 bytes. 
Artifact ID is 4233090246 2025-10-10T02:45:09.3111835Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/18392306192/artifacts/4233090246 2025-10-10T02:45:09.3490487Z ##[group]Run actions/upload-artifact@v4 2025-10-10T02:45:09.3491070Z with: 2025-10-10T02:45:09.3491900Z name: test-reports-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip 2025-10-10T02:45:09.3492971Z retention-days: 14 2025-10-10T02:45:09.3493478Z if-no-files-found: ignore 2025-10-10T02:45:09.3494002Z path: test/**/*.xml test/**/*.csv 2025-10-10T02:45:09.3494839Z compression-level: 6 2025-10-10T02:45:09.3495313Z overwrite: false 2025-10-10T02:45:09.3495770Z include-hidden-files: false 2025-10-10T02:45:09.3496287Z env: 2025-10-10T02:45:09.3496715Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:09.3497475Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:09.3498641Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:09.3499691Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:09.3501452Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:09.3503013Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:09.3503552Z AWS_REGION: us-east-1 2025-10-10T02:45:09.3504134Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:09.3504819Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:09.3514856Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:09.3515711Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:09.3516717Z ##[endgroup] 2025-10-10T02:45:10.2937374Z With the provided path, there will be 103 files uploaded 2025-10-10T02:45:10.2942538Z Artifact name is valid! 2025-10-10T02:45:10.2943158Z Root directory input is valid! 
2025-10-10T02:45:10.4479862Z Beginning upload of artifact content to blob storage 2025-10-10T02:45:10.9471649Z Uploaded bytes 436810 2025-10-10T02:45:10.9920409Z Finished uploading artifact content to blob storage! 2025-10-10T02:45:10.9925962Z SHA256 digest of uploaded artifact zip is 8075cb3a4107bfd01972c980453e94eff9fff5de0a4e10979b7e6c5f8fd34907 2025-10-10T02:45:10.9928537Z Finalizing artifact upload 2025-10-10T02:45:11.0936711Z Artifact test-reports-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip.zip successfully finalized. Artifact ID 4233090414 2025-10-10T02:45:11.0939019Z Artifact test-reports-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip has been successfully uploaded! Final size is 436810 bytes. Artifact ID is 4233090414 2025-10-10T02:45:11.0954891Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/18392306192/artifacts/4233090414 2025-10-10T02:45:11.1354424Z ##[group]Run actions/upload-artifact@v4 2025-10-10T02:45:11.1354969Z with: 2025-10-10T02:45:11.1355631Z name: logs-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip 2025-10-10T02:45:11.1356398Z retention-days: 14 2025-10-10T02:45:11.1356839Z if-no-files-found: ignore 2025-10-10T02:45:11.1357330Z path: usage_log.txt test/**/*.log 2025-10-10T02:45:11.1357857Z compression-level: 6 2025-10-10T02:45:11.1358366Z overwrite: false 2025-10-10T02:45:11.1358868Z include-hidden-files: false 2025-10-10T02:45:11.1359404Z env: 2025-10-10T02:45:11.1359824Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:11.1360586Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:11.1361618Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:11.1362629Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:11.1364898Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin 
--cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:11.1366314Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:11.1366792Z AWS_REGION: us-east-1 2025-10-10T02:45:11.1367339Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:11.1367951Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:11.1376743Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:11.1377498Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:11.1378261Z ##[endgroup] 2025-10-10T02:45:12.0749694Z Multiple search paths detected. Calculating the least common ancestor of all paths 2025-10-10T02:45:12.0751955Z The least common ancestor is /var/home/pytorchci/actions-runner/_work/pytorch/pytorch. This will be the root directory of the artifact 2025-10-10T02:45:12.0754310Z With the provided path, there will be 130 files uploaded 2025-10-10T02:45:12.0758123Z Artifact name is valid! 2025-10-10T02:45:12.0758733Z Root directory input is valid! 2025-10-10T02:45:12.2255364Z Beginning upload of artifact content to blob storage 2025-10-10T02:45:12.8889070Z Uploaded bytes 659945 2025-10-10T02:45:12.9353593Z Finished uploading artifact content to blob storage! 2025-10-10T02:45:12.9359901Z SHA256 digest of uploaded artifact zip is 217306660e52b3ef563cd2a77beff67995343bc9e6e3cda3eeea05c796523257 2025-10-10T02:45:12.9363114Z Finalizing artifact upload 2025-10-10T02:45:13.0652014Z Artifact logs-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip.zip successfully finalized. Artifact ID 4233090602 2025-10-10T02:45:13.0654262Z Artifact logs-runattempt1-test-default-1-6-linux.rocm.gpu.2_52406492265.zip has been successfully uploaded! Final size is 659945 bytes. 
Artifact ID is 4233090602 2025-10-10T02:45:13.0666390Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/18392306192/artifacts/4233090602 2025-10-10T02:45:13.1068600Z ##[group]Run # shellcheck disable=SC2156 2025-10-10T02:45:13.1069318Z # shellcheck disable=SC2156 2025-10-10T02:45:13.1070333Z find . -iname "core.[1-9]*" -exec docker exec "${CONTAINER_NAME}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-10-10T02:45:13.1128583Z shell: /usr/bin/bash -e {0} 2025-10-10T02:45:13.1129122Z env: 2025-10-10T02:45:13.1129549Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:13.1130339Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:13.1131500Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:13.1132581Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:13.1134617Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:13.1136244Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:13.1136785Z AWS_REGION: us-east-1 2025-10-10T02:45:13.1137400Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:13.1138113Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:13.1147840Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:13.1148641Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:13.1149503Z ##[endgroup] 2025-10-10T02:45:13.5228011Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-10-10T02:45:13.5229015Z with: 2025-10-10T02:45:13.5229776Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_upload-benchmark-results 2025-10-10T02:45:13.5230688Z role-duration-seconds: 18000 2025-10-10T02:45:13.5231240Z aws-region: us-east-1 2025-10-10T02:45:13.5231759Z audience: sts.amazonaws.com 
2025-10-10T02:45:13.5232306Z env: 2025-10-10T02:45:13.5232719Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:13.5233520Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:13.5235425Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:13.5236526Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:13.5238397Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:13.5240026Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:13.5240654Z AWS_REGION: us-east-1 2025-10-10T02:45:13.5241348Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:13.5242187Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:13.5251859Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:13.5252685Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:13.5253844Z ##[endgroup] 2025-10-10T02:45:13.8424594Z Assuming role with OIDC 2025-10-10T02:45:14.0238020Z Authenticated as assumedRoleId AROAUPVRELQNA5GQHA6IA:GitHubActions 2025-10-10T02:45:14.1072361Z ##[group]Run pytorch/test-infra/.github/actions/upload-benchmark-results@main 2025-10-10T02:45:14.1073344Z with: 2025-10-10T02:45:14.1073886Z benchmark-results-dir: test/test-reports 2025-10-10T02:45:14.1074965Z dry-run: false 2025-10-10T02:45:14.1075512Z schema-version: v3 2025-10-10T02:45:14.1076368Z github-token: *** 2025-10-10T02:45:14.1076881Z env: 2025-10-10T02:45:14.1077370Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:14.1078286Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:14.1079538Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:14.1080711Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:14.1082506Z 
GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:14.1084032Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:14.1084561Z AWS_REGION: us-east-1 2025-10-10T02:45:14.1085077Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:14.1085732Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:14.1094302Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:14.1095054Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:14.1095853Z ##[endgroup] 2025-10-10T02:45:14.1155652Z ##[group]Run set -eux 2025-10-10T02:45:14.1156170Z set -eux 2025-10-10T02:45:14.1156585Z  2025-10-10T02:45:14.1156974Z if [[ -n "" ]]; then 2025-10-10T02:45:14.1157438Z  source "" 2025-10-10T02:45:14.1157866Z fi 2025-10-10T02:45:14.1158498Z python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-10-10T02:45:14.1159226Z  2025-10-10T02:45:14.1159613Z DEVICE_NAME="" 2025-10-10T02:45:14.1160082Z DEVICE_TYPE="" 2025-10-10T02:45:14.1160521Z  2025-10-10T02:45:14.1160937Z if command -v nvidia-smi; then 2025-10-10T02:45:14.1161718Z  # NB: I'm using PyTorch here to get the device name, however, it needs to 2025-10-10T02:45:14.1162695Z  # install the correct version of PyTorch manually for now. Any PyTorch 2025-10-10T02:45:14.1163585Z  # version is fine, I just use 2.7.1 to satify PYPIDEP linter 2025-10-10T02:45:14.1164324Z  python3 -mpip install torch==2.7.1 2025-10-10T02:45:14.1164932Z elif command -v rocminfo; then 2025-10-10T02:45:14.1165648Z  # NB: Installing torch on ROCm runner with pip here causes CI to fail 2025-10-10T02:45:14.1166576Z  # with a memoryview is too large error only on MI300 runners. Is pip 2025-10-10T02:45:14.1167486Z  # version on ROCm runner there too old? 
As a workaround, let's use the 2025-10-10T02:45:14.1168310Z  # GPU device name coming from rocminfo instead 2025-10-10T02:45:14.1169188Z  DEVICE_NAME=rocm 2025-10-10T02:45:14.1170007Z  DEVICE_TYPE=$(rocminfo | grep "Marketing Name" | tail -n1 | awk -F':' '{print $2}' | xargs) 2025-10-10T02:45:14.1170817Z fi 2025-10-10T02:45:14.1171197Z  2025-10-10T02:45:14.1171677Z echo "DEVICE_NAME=$DEVICE_NAME" >> $GITHUB_ENV 2025-10-10T02:45:14.1172368Z echo "DEVICE_TYPE=$DEVICE_TYPE" >> $GITHUB_ENV 2025-10-10T02:45:14.1222839Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:14.1223509Z env: 2025-10-10T02:45:14.1223921Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:14.1224664Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:14.1225719Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:14.1226954Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:14.1228956Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:14.1230441Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:14.1230940Z AWS_REGION: us-east-1 2025-10-10T02:45:14.1231496Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:14.1232158Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:14.1240908Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:14.1241649Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:14.1242436Z ##[endgroup] 2025-10-10T02:45:14.1331623Z + [[ -n '' ]] 2025-10-10T02:45:14.1332504Z + python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-10-10T02:45:14.4215180Z Defaulting to user installation because normal site-packages is not writeable 2025-10-10T02:45:14.5925183Z Requirement already satisfied: boto3==1.35.33 in 
/var/home/pytorchci/.local/lib/python3.10/site-packages (1.35.33) 2025-10-10T02:45:14.5928784Z Requirement already satisfied: psutil==7.0.0 in /var/home/pytorchci/.local/lib/python3.10/site-packages (7.0.0) 2025-10-10T02:45:14.5934507Z Requirement already satisfied: pynvml==12.0.0 in /var/home/pytorchci/.local/lib/python3.10/site-packages (12.0.0) 2025-10-10T02:45:14.5967918Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /usr/lib/python3/dist-packages (from boto3==1.35.33) (0.10.0) 2025-10-10T02:45:14.5973854Z Requirement already satisfied: s3transfer<0.11.0,>=0.10.0 in /var/home/pytorchci/.local/lib/python3.10/site-packages (from boto3==1.35.33) (0.10.4) 2025-10-10T02:45:14.5977274Z Requirement already satisfied: botocore<1.36.0,>=1.35.33 in /var/home/pytorchci/.local/lib/python3.10/site-packages (from boto3==1.35.33) (1.35.99) 2025-10-10T02:45:14.6125162Z Requirement already satisfied: nvidia-ml-py<13.0.0a0,>=12.0.0 in /var/home/pytorchci/.local/lib/python3.10/site-packages (from pynvml==12.0.0) (12.575.51) 2025-10-10T02:45:14.6172119Z Requirement already satisfied: urllib3!=2.2.0,<3,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.26.5) 2025-10-10T02:45:14.6177742Z Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /var/home/pytorchci/.local/lib/python3.10/site-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (2.8.2) 2025-10-10T02:45:14.6219643Z Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.16.0) 2025-10-10T02:45:14.9556786Z + DEVICE_NAME= 2025-10-10T02:45:14.9557346Z + DEVICE_TYPE= 2025-10-10T02:45:14.9557882Z + command -v nvidia-smi 2025-10-10T02:45:14.9558440Z + command -v rocminfo 2025-10-10T02:45:14.9558965Z + DEVICE_NAME=rocm 2025-10-10T02:45:14.9559460Z /opt/rocm/bin/rocminfo 2025-10-10T02:45:14.9574923Z ++ rocminfo 2025-10-10T02:45:14.9579315Z ++ grep 'Marketing Name' 
2025-10-10T02:45:14.9582744Z ++ tail -n1 2025-10-10T02:45:14.9589079Z ++ awk -F: '{print $2}' 2025-10-10T02:45:14.9590692Z ++ xargs 2025-10-10T02:45:15.1148287Z + DEVICE_TYPE='AMD Instinct MI250X/MI250' 2025-10-10T02:45:15.1149039Z + echo DEVICE_NAME=rocm 2025-10-10T02:45:15.1149646Z + echo 'DEVICE_TYPE=AMD Instinct MI250X/MI250' 2025-10-10T02:45:15.1206240Z ##[group]Run set -eux 2025-10-10T02:45:15.1206832Z set -eux 2025-10-10T02:45:15.1207327Z  2025-10-10T02:45:15.1207855Z if [[ -z "${GITHUB_TOKEN}" ]]; then 2025-10-10T02:45:15.1208602Z  echo "Missing github-token input" 2025-10-10T02:45:15.1209265Z  exit 1 2025-10-10T02:45:15.1209730Z fi 2025-10-10T02:45:15.1270077Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:15.1270864Z env: 2025-10-10T02:45:15.1271356Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:15.1272226Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:15.1273766Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:15.1275307Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:15.1278031Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:15.1279766Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:15.1280384Z AWS_REGION: us-east-1 2025-10-10T02:45:15.1281072Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:15.1281859Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:15.1292891Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:15.1293930Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:15.1295043Z DEVICE_NAME: rocm 2025-10-10T02:45:15.1295699Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:15.1296609Z GITHUB_TOKEN: *** 2025-10-10T02:45:15.1297124Z ##[endgroup] 2025-10-10T02:45:15.1397482Z + [[ -z 
*** ]] 2025-10-10T02:45:15.1486341Z ##[group]Run pytorch/test-infra/.github/actions/get-workflow-job-id@main 2025-10-10T02:45:15.1487226Z with: 2025-10-10T02:45:15.1487969Z github-token: *** 2025-10-10T02:45:15.1488498Z env: 2025-10-10T02:45:15.1488976Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:15.1489849Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:15.1491166Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:15.1492558Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:15.1494494Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:15.1496225Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:15.1496829Z AWS_REGION: us-east-1 2025-10-10T02:45:15.1497448Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:15.1498214Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:15.1508460Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:15.1509360Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:15.1510310Z DEVICE_NAME: rocm 2025-10-10T02:45:15.1510864Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:15.1511511Z ##[endgroup] 2025-10-10T02:45:15.1545535Z ##[group]Run set -eux 2025-10-10T02:45:15.1546111Z set -eux 2025-10-10T02:45:15.1546613Z  2025-10-10T02:45:15.1547563Z python3 "${GITHUB_ACTION_PATH}/../../scripts/get_workflow_job_id.py" "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-10-10T02:45:15.1601660Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:15.1602444Z env: 2025-10-10T02:45:15.1602926Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:15.1603790Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:15.1605088Z RUNNER_TEST_RESULTS_DIR: 
/var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:15.1606325Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:15.1608561Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:15.1610299Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:15.1610913Z AWS_REGION: us-east-1 2025-10-10T02:45:15.1611615Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:15.1612510Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:15.1622832Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:15.1623712Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:15.1624675Z DEVICE_NAME: rocm 2025-10-10T02:45:15.1625240Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:15.1626100Z GITHUB_TOKEN: *** 2025-10-10T02:45:15.1626619Z ##[endgroup] 2025-10-10T02:45:15.1727101Z + python3 /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/get-workflow-job-id/../../scripts/get_workflow_job_id.py 18392306192 gpud501 2025-10-10T02:45:15.7420914Z setting job-id=52406492265 2025-10-10T02:45:15.7421916Z setting job-name=linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T02:45:15.7649360Z ##[group]Run set -eux 2025-10-10T02:45:15.7649957Z set -eux 2025-10-10T02:45:15.7650440Z  2025-10-10T02:45:15.7650913Z if [[ -n "" ]]; then 2025-10-10T02:45:15.7651476Z  source "" 2025-10-10T02:45:15.7651982Z fi 2025-10-10T02:45:15.7652442Z  2025-10-10T02:45:15.7653237Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_metadata.py" \ 2025-10-10T02:45:15.7654290Z  --schema-version "${SCHEMA_VERSION}" \ 2025-10-10T02:45:15.7655004Z  --repo "${REPO}" \ 2025-10-10T02:45:15.7655624Z  --head-branch "${HEAD_BRANCH}" \ 2025-10-10T02:45:15.7656341Z  --head-sha "${HEAD_SHA}" \ 2025-10-10T02:45:15.7657057Z  
--workflow-id "${WORKFLOW_RUN_ID}" \ 2025-10-10T02:45:15.7657827Z  --run-attempt "${RUN_ATTEMPT}" \ 2025-10-10T02:45:15.7658497Z  --job-id "${JOB_ID}" \ 2025-10-10T02:45:15.7659134Z  --job-name "${JOB_NAME}" 2025-10-10T02:45:15.7718067Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:15.7718872Z env: 2025-10-10T02:45:15.7719352Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:15.7720255Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:15.7721553Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:15.7722743Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:15.7724645Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:15.7726396Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:15.7727013Z AWS_REGION: us-east-1 2025-10-10T02:45:15.7727666Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:15.7728438Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:15.7738671Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:15.7739572Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:15.7740521Z DEVICE_NAME: rocm 2025-10-10T02:45:15.7741067Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:15.7741720Z SCHEMA_VERSION: v3 2025-10-10T02:45:15.7742250Z REPO: pytorch/pytorch 2025-10-10T02:45:15.7742794Z HEAD_BRANCH: refs/heads/main 2025-10-10T02:45:15.7743465Z HEAD_SHA: 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T02:45:15.7744172Z WORKFLOW_RUN_ID: 18392306192 2025-10-10T02:45:15.7744723Z RUN_ATTEMPT: 1 2025-10-10T02:45:15.7745219Z JOB_ID: 52406492265 2025-10-10T02:45:15.7746021Z JOB_NAME: linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2) 2025-10-10T02:45:15.7746891Z ##[endgroup] 2025-10-10T02:45:15.7842886Z + [[ 
-n '' ]] 2025-10-10T02:45:15.7846392Z + python3 /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_metadata.py --schema-version v3 --repo pytorch/pytorch --head-branch refs/heads/main --head-sha 344e6365a0068c2d2847fcec0c55dd53291d475e --workflow-id 18392306192 --run-attempt 1 --job-id 52406492265 --job-name 'linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2)' 2025-10-10T02:45:15.8202409Z ##[group]Run set -eux 2025-10-10T02:45:15.8202973Z set -eux 2025-10-10T02:45:15.8203440Z  2025-10-10T02:45:15.8203975Z if [[ -n "" ]]; then 2025-10-10T02:45:15.8204648Z  source "" 2025-10-10T02:45:15.8205221Z fi 2025-10-10T02:45:15.8205748Z  2025-10-10T02:45:15.8206672Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_runners_info.py" 2025-10-10T02:45:15.8261832Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:15.8262606Z env: 2025-10-10T02:45:15.8263484Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:15.8264377Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:15.8265615Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:15.8266767Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:15.8268642Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:15.8270336Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:15.8270933Z AWS_REGION: us-east-1 2025-10-10T02:45:15.8271573Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:15.8272354Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:15.8283045Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:15.8283980Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:15.8285113Z 
DEVICE_NAME: rocm 2025-10-10T02:45:15.8285779Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:15.8286543Z ##[endgroup] 2025-10-10T02:45:15.8365505Z + [[ -n '' ]] 2025-10-10T02:45:15.8367170Z + python3 /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_runners_info.py 2025-10-10T02:45:17.2817435Z ##[group]Run set -eux 2025-10-10T02:45:17.2818010Z set -eux 2025-10-10T02:45:17.2818481Z  2025-10-10T02:45:17.2819015Z # TODO (huydhn): Implement this part 2025-10-10T02:45:17.2819808Z echo "dependencies={}" >> "${GITHUB_OUTPUT}" 2025-10-10T02:45:17.2873405Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:17.2874495Z env: 2025-10-10T02:45:17.2874996Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:17.2875903Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:17.2877225Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:17.2878406Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:17.2880344Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:17.2882067Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:17.2882684Z AWS_REGION: us-east-1 2025-10-10T02:45:17.2883324Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:17.2884133Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:17.2894411Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:17.2895302Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:17.2896246Z DEVICE_NAME: rocm 2025-10-10T02:45:17.2896795Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:17.2897430Z ##[endgroup] 2025-10-10T02:45:17.3021091Z + echo 'dependencies={}' 2025-10-10T02:45:17.3069369Z ##[group]Run set 
-eux 2025-10-10T02:45:17.3069960Z set -eux 2025-10-10T02:45:17.3070432Z  2025-10-10T02:45:17.3070900Z if [[ -n "" ]]; then 2025-10-10T02:45:17.3071459Z  source "" 2025-10-10T02:45:17.3071971Z fi 2025-10-10T02:45:17.3072417Z  2025-10-10T02:45:17.3072953Z if [[ ! -d "${BENCHMARK_RESULTS_DIR}" ]]; then 2025-10-10T02:45:17.3073815Z  echo "${BENCHMARK_RESULTS_DIR} does not exist, skipping" 2025-10-10T02:45:17.3074903Z  # We don't want the job to fail if the directory doesn't exist 2025-10-10T02:45:17.3075664Z  exit 0 2025-10-10T02:45:17.3076145Z fi 2025-10-10T02:45:17.3076587Z  2025-10-10T02:45:17.3077073Z if [[ "${DRY_RUN}" == "true" ]]; then 2025-10-10T02:45:17.3078288Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-10-10T02:45:17.3079342Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-10-10T02:45:17.3080526Z  --metadata "${BENCHMARK_METADATA}" \ 2025-10-10T02:45:17.3081233Z  --runners "${RUNNER_INFO}" \ 2025-10-10T02:45:17.3081934Z  --dependencies "${DEPENDENCIES}" \ 2025-10-10T02:45:17.3082600Z  --dry-run 2025-10-10T02:45:17.3083118Z else 2025-10-10T02:45:17.3083865Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-10-10T02:45:17.3084903Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-10-10T02:45:17.3085709Z  --metadata "${BENCHMARK_METADATA}" \ 2025-10-10T02:45:17.3086405Z  --runners "${RUNNER_INFO}" \ 2025-10-10T02:45:17.3087096Z  --dependencies "${DEPENDENCIES}" 2025-10-10T02:45:17.3087717Z fi 2025-10-10T02:45:17.3143272Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:17.3144044Z env: 2025-10-10T02:45:17.3144529Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:17.3145400Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:17.3146626Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:17.3147814Z RUNNER_DOCS_DIR: 
/var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:17.3149661Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:17.3151286Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:17.3151887Z AWS_REGION: us-east-1 2025-10-10T02:45:17.3152539Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:17.3153313Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:17.3163188Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:17.3164044Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:17.3164969Z DEVICE_NAME: rocm 2025-10-10T02:45:17.3165541Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:17.3166224Z BENCHMARK_RESULTS_DIR: test/test-reports 2025-10-10T02:45:17.3166847Z DRY_RUN: false 2025-10-10T02:45:17.3169057Z BENCHMARK_METADATA: {"timestamp": 1760064315, "schema_version": "v3", "name": "linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "344e6365a0068c2d2847fcec0c55dd53291d475e", "workflow_id": 18392306192, "run_attempt": 1, "job_id": 52406492265} 2025-10-10T02:45:17.3172121Z RUNNER_INFO: [{"cpu_info": "x86_64", "cpu_count": 128, "avail_mem_in_gb": 1007, "extra_info": {"hostname": "gpud501.jax.cs.cpe.ice.amd.com"}, "name": "rocm", "type": "AMD Instinct MI250X/MI250"}] 2025-10-10T02:45:17.3173486Z DEPENDENCIES: {} 2025-10-10T02:45:17.3173986Z ##[endgroup] 2025-10-10T02:45:17.3267894Z + [[ -n '' ]] 2025-10-10T02:45:17.3268370Z + [[ ! 
-d test/test-reports ]] 2025-10-10T02:45:17.3268906Z + [[ false == \t\r\u\e ]] 2025-10-10T02:45:17.3273568Z + python3 /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py --benchmark-results-dir test/test-reports --metadata '{"timestamp": 1760064315, "schema_version": "v3", "name": "linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "344e6365a0068c2d2847fcec0c55dd53291d475e", "workflow_id": 18392306192, "run_attempt": 1, "job_id": 52406492265}' --runners '[{"cpu_info": "x86_64", "cpu_count": 128, "avail_mem_in_gb": 1007, "extra_info": {"hostname": "gpud501.jax.cs.cpe.ice.amd.com"}, "name": "rocm", "type": "AMD Instinct MI250X/MI250"}]' --dependencies '{}' 2025-10-10T02:45:17.4980412Z /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'test_public_bindings'}, {'test_file': 'inductor/test_aot_inductor'}, {'test_file': 'inductor/test_torchinductor'}, {'test_file': 'inductor/test_triton_kernels'}, {'test_file': 'inductor/test_triton_heuristics'}, {'test_file': 'inductor/test_codecache'}, {'test_file': 'inductor/test_profiler'}, {'test_file': 'dynamo/test_structured_trace'}, {'test_file': 'inductor/test_flex_attention'}, {'test_file': 'inductor/test_torchinductor_strided_blocks'}, {'test_file': 'inductor/test_fxir_backend'}, {'test_file': 'inductor/test_best_config'}, {'test_file': 'inductor/test_torchinductor_opinfo'}, {'test_file': 'inductor/test_triton_cpu_backend'}, {'test_file': 'inductor/test_static_cuda_launcher'}, {'test_file': 'inductor/test_cooperative_reductions'}, {'test_file': 'dynamo/test_callback'}, {'test_file': 'inductor/test_kernel_benchmark'}, {'test_file': 'inductor/test_cuda_repro'}, {'test_file': 
'inductor/test_torchinductor_dynamic_shapes'}, {'test_file': 'inductor/test_async_compile'}, {'test_file': 'inductor/test_fp8'}, {'test_file': 'inductor/test_inplace_padding'}, {'test_file': 'inductor/test_halide'}, {'test_file': 'inductor/test_padding'}, {'test_file': 'inductor/test_coordinate_descent_tuner'}, {'test_file': 'inductor/test_torchinductor_codegen_dynamic_shapes'}, {'test_file': 'inductor/test_select_algorithm'}, {'test_file': 'inductor/test_analysis'}, {'test_file': 'inductor/test_inductor_scheduler'}, {'test_file': 'inductor/test_utils'}, {'test_file': 'dynamo/test_package'}, {'test_file': 'dynamo/test_compile'}, {'test_file': 'inductor/test_triton_syntax'}, {'test_file': 'inductor/test_codegen_triton'}, {'test_file': 'inductor/test_snode_runtime'}, {'test_file': 'inductor/test_template_heuristics_registry'}, {'test_file': 'inductor/test_triton_extension_backend'}, {'test_file': 'inductor/test_triton_wrapper'}, {'test_file': 'dynamo/test_einops'}, {'test_file': 'inductor/test_extension_backend'}, {'test_file': 'inductor/test_pattern_matcher'}, {'test_file': 'dynamo/test_functions'}, {'test_file': 'dynamo/test_dynamic_shapes'}, {'test_file': 'export/test_serialize'}, {'test_file': 'dynamo/test_utils'}, {'test_file': 'dynamo/test_backends'}, {'test_file': 'dynamo/test_ctx_manager'}, {'test_file': 'inductor/test_minifier'}, {'test_file': 'inductor/test_perf'}, {'test_file': 'inductor/test_compiled_autograd'}, {'test_file': 'inductor/test_loop_ordering'}, {'test_file': 'inductor/test_deterministic'}, {'test_file': 'dynamo/test_decorators'}, {'test_file': 'inductor/test_cudacodecache'}, {'test_file': 'test_model_exports_to_core_aten'}, {'test_file': 'inductor/test_cpu_select_algorithm'}, {'test_file': 'inductor/test_cuda_select_algorithm'}, {'test_file': 'inductor/test_compile_subprocess'}, {'test_file': 'dynamo/test_activation_checkpointing'}, {'test_file': 'inductor/test_multi_kernel'}, {'test_file': 'export/test_export'}, {'test_file': 
'inductor/test_custom_post_grad_passes'}, {'test_file': 'inductor/test_aot_inductor_arrayref'}, {'test_file': 'inductor/test_unbacked_symints'}, {'test_file': 'inductor/test_cpp_wrapper_hipify'}, {'test_file': 'inductor/test_compiled_optimizers'}, {'test_file': 'inductor/test_memory'}, {'test_file': 'dynamo/test_fx_graph_runnable'}, {'test_file': 'export/test_cpp_serdes'}, {'test_file': 'export/test_export_training_ir_to_run_decomp'}, {'test_file': 'export/test_retraceability'}, {'test_file': 'export/test_serdes'}, {'test_file': 'dynamo/test_compiler_bisector'}, {'test_file': 'inductor/test_decompose_mem_bound_mm'}, {'test_file': 'inductor/test_scatter_optimization'}, {'test_file': 'inductor/test_aot_inductor_package'}, {'test_file': 'inductor/test_mkldnn_pattern_matcher'}, {'test_file': 'dynamo/test_logging'}, {'test_file': 'inductor/test_cpu_cpp_wrapper'}, {'test_file': 'inductor/test_flex_decoding'}, {'test_file': 'dynamo/test_aot_autograd_cache'}, {'test_file': 'inductor/test_cpu_repro'}, {'test_file': 'inductor/test_ordered_set'}, {'test_file': 'dynamo/test_autograd_function'}, {'test_file': 'inductor/test_inplacing_pass'}, {'test_file': 'inductor/test_benchmarking'}, {'test_file': 'dynamo/test_guard_serialization'}, {'test_file': 'dynamo/test_recompile_ux'}, {'test_file': 'inductor/test_group_batch_fusion'}, {'test_file': 'dynamo/test_unspec'}, {'test_file': 'dynamo/test_modes'}, {'test_file': 'inductor/test_control_flow'}, {'test_file': 'inductor/test_provenance_tracing'}, {'test_file': 'inductor/test_inductor_freezing'}, {'test_file': 'inductor/test_distributed_patterns'}, {'test_file': 'dynamo/test_repros'}, {'test_file': 'inductor/test_cutlass_evt'}, {'test_file': 'dynamo/test_aot_autograd'}, {'test_file': 'dynamo/test_aot_compile'}, {'test_file': 'inductor/test_foreach'}, {'test_file': 'inductor/test_indexing'}, {'test_file': 'export/test_export_with_inline_and_install'}, {'test_file': 'inductor/test_metrics'}, {'test_file': 
'dynamo/test_fake_distributed'}, {'test_file': 'export/test_export_strict'}, {'test_file': 'export/test_strict_export_v2'}, {'test_file': 'export/test_nativert'}, {'test_file': 'inductor/test_memory_planning'}, {'test_file': 'inductor/test_remote_cache'}, {'test_file': 'inductor/test_pad_mm'}, {'test_file': 'dynamo/test_after_aot'}, {'test_file': 'dynamo/test_backward_higher_order_ops'}, {'test_file': 'dynamo/test_base_hop'}, {'test_file': 'dynamo/test_base_output'}, {'test_file': 'dynamo/test_buffers_override'}, {'test_file': 'dynamo/test_bytecode_utils'}, {'test_file': 'dynamo/test_comptime'}, {'test_file': 'dynamo/test_config'}, {'test_file': 'dynamo/test_cudagraphs'}, {'test_file': 'dynamo/test_cudagraphs_expandable_segments'}, {'test_file': 'dynamo/test_debug_utils'}, {'test_file': 'dynamo/test_deque_reconstruct'}, {'test_file': 'dynamo/test_deviceguard'}, {'test_file': 'dynamo/test_dicts'}, {'test_file': 'dynamo/test_error_messages'}, {'test_file': 'dynamo/test_exc'}, {'test_file': 'dynamo/test_exceptions'}, {'test_file': 'dynamo/test_export'}, {'test_file': 'dynamo/test_export_mutations'}, {'test_file': 'dynamo/test_flat_apply'}, {'test_file': 'dynamo/test_frame_init'}, {'test_file': 'dynamo/test_fx_annotate'}, {'test_file': 'dynamo/test_fx_passes_pre_grad'}, {'test_file': 'dynamo/test_generator'}, {'test_file': 'dynamo/test_global'}, {'test_file': 'dynamo/test_graph_deduplication'}, {'test_file': 'dynamo/test_graph_region_tracker'}, {'test_file': 'dynamo/test_guard_manager'}, {'test_file': 'dynamo/test_higher_order_ops'}, {'test_file': 'dynamo/test_hooks'}, {'test_file': 'dynamo/test_inline_and_install'}, {'test_file': 'dynamo/test_input_attr_tracking'}, {'test_file': 'dynamo/test_install_free_tensors'}, {'test_file': 'dynamo/test_interop'}, {'test_file': 'dynamo/test_list'}, {'test_file': 'dynamo/test_metrics_context'}, {'test_file': 'dynamo/test_minifier'}, {'test_file': 'dynamo/test_misc'}, {'test_file': 'dynamo/test_model_output'}, {'test_file': 
'dynamo/test_modules'}, {'test_file': 'dynamo/test_nested_graph_breaks'}, {'test_file': 'dynamo/test_nops'}, {'test_file': 'dynamo/test_optimizers'}, {'test_file': 'dynamo/test_pgo'}, {'test_file': 'dynamo/test_pre_dispatch'}, {'test_file': 'dynamo/test_precompile_context'}, {'test_file': 'dynamo/test_profiler'}, {'test_file': 'dynamo/test_python_autograd'}, {'test_file': 'dynamo/test_python_dispatcher'}, {'test_file': 'dynamo/test_recompiles'}, {'test_file': 'dynamo/test_reconstruct'}, {'test_file': 'dynamo/test_reorder_logs'}, {'test_file': 'dynamo/test_resume'}, {'test_file': 'dynamo/test_sdpa'}, {'test_file': 'dynamo/test_sets'}, {'test_file': 'dynamo/test_skip_guard_eval_unsafe'}, {'test_file': 'dynamo/test_skip_non_tensor'}, {'test_file': 'dynamo/test_sources'}, {'test_file': 'dynamo/test_subclasses'}, {'test_file': 'dynamo/test_subgraphs'}, {'test_file': 'dynamo/test_torchrec'}, {'test_file': 'dynamo/test_trace_rules'}, {'test_file': 'dynamo/test_unittest'}, {'test_file': 'dynamo/test_verify_correctness'}, {'test_file': 'dynamo/test_view'}, {'test_file': 'export/test_converter'}, {'test_file': 'export/test_db'}, {'test_file': 'export/test_draft_export'}, {'test_file': 'export/test_dynamic_shapes'}, {'test_file': 'export/test_experimental'}, {'test_file': 'export/test_export_opinfo'}, {'test_file': 'export/test_functionalized_assertions'}, {'test_file': 'export/test_hop'}, {'test_file': 'export/test_lift_unlift'}, {'test_file': 'export/test_package'}, {'test_file': 'export/test_pass_infra'}, {'test_file': 'export/test_passes'}, {'test_file': 'export/test_schema'}, {'test_file': 'export/test_sparse'}, {'test_file': 'export/test_swap'}, {'test_file': 'export/test_tools'}, {'test_file': 'export/test_torchbind'}, {'test_file': 'export/test_tree_utils'}, {'test_file': 'export/test_unflatten'}, {'test_file': 'export/test_unflatten_training_ir'}, {'test_file': 'export/test_upgrader'}, {'test_file': 'export/test_verifier'}, {'test_file': 'inductor/test_alignment'}, 
{'test_file': 'inductor/test_aot_inductor_custom_ops'}, {'test_file': 'inductor/test_aot_inductor_utils'}, {'test_file': 'inductor/test_aot_inductor_windows'}, {'test_file': 'inductor/test_augmented_graph_helper'}, {'test_file': 'inductor/test_auto_functionalize'}, {'test_file': 'inductor/test_autoheuristic'}, {'test_file': 'inductor/test_b2b_gemm'}, {'test_file': 'inductor/test_benchmark_fusion'}, {'test_file': 'inductor/test_binary_folding'}, {'test_file': 'inductor/test_block_analysis'}, {'test_file': 'inductor/test_cache'}, {'test_file': 'inductor/test_caching'}, {'test_file': 'inductor/test_ck_backend'}, {'test_file': 'inductor/test_combo_kernels'}, {'test_file': 'inductor/test_compile'}, {'test_file': 'inductor/test_compile_worker'}, {'test_file': 'inductor/test_config'}, {'test_file': 'inductor/test_control_deps'}, {'test_file': 'inductor/test_cudagraph_trees'}, {'test_file': 'inductor/test_cudagraph_trees_expandable_segments'}, {'test_file': 'inductor/test_custom_lowering'}, {'test_file': 'inductor/test_custom_partitioner_fn'}, {'test_file': 'inductor/test_cutedsl_template'}, {'test_file': 'inductor/test_cutlass_backend'}, {'test_file': 'inductor/test_debug_trace'}, {'test_file': 'inductor/test_dependencies'}, {'test_file': 'inductor/test_device_assert'}, {'test_file': 'inductor/test_efficient_conv_bn_eval'}, {'test_file': 'inductor/test_external_callables'}, {'test_file': 'inductor/test_fused_attention'}, {'test_file': 'inductor/test_fuzzer'}, {'test_file': 'inductor/test_fx_fusion'}, {'test_file': 'inductor/test_gpu_cpp_wrapper'}, {'test_file': 'inductor/test_graph_transform_observer'}, {'test_file': 'inductor/test_helion_kernels'}, {'test_file': 'inductor/test_inductor_annotations'}, {'test_file': 'inductor/test_inductor_utils'}, {'test_file': 'inductor/test_kernel_optimization'}, {'test_file': 'inductor/test_layout_optim'}, {'test_file': 'inductor/test_mem_estimation'}, {'test_file': 'inductor/test_minifier_isolate'}, {'test_file': 
'inductor/test_minifier_utils'}, {'test_file': 'inductor/test_mmdecomp'}, {'test_file': 'inductor/test_move_constructors_to_cuda'}, {'test_file': 'inductor/test_mps_basic'}, {'test_file': 'inductor/test_needs_exact_strides'}, {'test_file': 'inductor/test_online_softmax'}, {'test_file': 'inductor/test_op_completeness'}, {'test_file': 'inductor/test_op_dtype_prop'}, {'test_file': 'inductor/test_quantization'}, {'test_file': 'inductor/test_segmented_tree'}, {'test_file': 'inductor/test_smoke'}, {'test_file': 'inductor/test_split_cat_fx_aten_passes'}, {'test_file': 'inductor/test_split_cat_fx_passes'}, {'test_file': 'inductor/test_subgraph_choice'}, {'test_file': 'inductor/test_torchbind'}, {'test_file': 'inductor/test_torchinductor_codegen_config_overrides'}, {'test_file': 'inductor/test_xpu_basic'}, {'test_file': 'profiler/test_profiler'}, {'test_file': 'test_torch'}, {'test_file': 'test_ops'}, {'test_file': 'test_fx'}, {'test_file': 'test_matmul_cuda'}, {'test_file': 'test_modules'}, {'test_file': 'test_cuda'}, {'test_file': 'test_cuda_expandable_segments'}, {'test_file': 'test_meta'}, {'test_file': 'profiler/test_memory_profiler'}, {'test_file': 'test_testing'}, {'test_file': 'test_decomp'}, {'test_file': 'functorch/test_ops'}, {'test_file': 'test_autograd'}, {'test_file': 'functorch/test_dims'}, {'test_file': 'nn/test_parametrization'}, {'test_file': 'profiler/test_kineto'}, {'test_file': 'test_ci_sanity_check_fail'}, {'test_file': 'test_cuda_multigpu'}, {'test_file': 'test_jit_fuser_te'}, {'test_file': 'test_mobile_optimizer'}, {'test_file': 'test_nestedtensor'}, {'test_file': 'test_type_hints'}, {'test_file': 'test_linalg'}, {'test_file': 'test_jit'}, {'test_file': 'profiler/test_torch_tidy'}, {'test_file': 'test_overrides'}, {'test_file': 'nn/test_multihead_attention'}, {'test_file': 'higher_order_ops/test_invoke_subgraph'}, {'test_file': 'test_hub'}, {'test_file': 'functorch/test_aotdispatch'}, {'test_file': 'test_unary_ufuncs'}, {'test_file': 
'distributions/test_distributions'}, {'test_file': 'profiler/test_execution_trace'}, {'test_file': 'profiler/test_record_function'}, {'test_file': 'test_multiprocessing_spawn'}, {'test_file': 'test_sparse_semi_structured'}, {'test_file': 'doctests'}, {'test_file': 'test_autoload_enable'}, {'test_file': 'functorch/test_aot_joint_with_descriptors'}, {'test_file': 'test_reductions'}, {'test_file': 'test_fake_tensor'}, {'test_file': 'functorch/test_eager_transforms'}, {'test_file': 'functorch/test_vmap'}, {'test_file': 'test_nn'}, {'test_file': 'test_numba_integration'}, {'test_file': 'torch_np/numpy_tests/core/test_multiarray'}, {'test_file': 'test_privateuseone_python_backend'}, {'test_file': 'test_spectral_ops'}, {'test_file': 'test_numa_binding'}, {'test_file': 'test_torchfuzz_repros'}, {'test_file': 'test_transformers'}, {'test_file': 'test_maskedtensor'}, {'test_file': 'torch_np/test_ndarray_methods'}, {'test_file': 'backends/xeon/test_launch'}, {'test_file': 'benchmark_utils/test_benchmark_utils'}, {'test_file': 'cpp_extensions/libtorch_agnostic_extension/test/test_libtorch_agnostic'}, {'test_file': 'cpp_extensions/python_agnostic_extension/test/test_python_agnostic'}, {'test_file': 'cpp_extensions/torch_stable_test_extension/torch_stable_test/test_torch_stable'}, {'test_file': 'distributions/test_constraints'}, {'test_file': 'functorch/dim/test_getsetitem'}, {'test_file': 'functorch/dim/test_split'}, {'test_file': 'functorch/test_ac'}, {'test_file': 'functorch/test_ac_knapsack'}, {'test_file': 'functorch/test_ac_logging'}, {'test_file': 'functorch/test_control_flow'}, {'test_file': 'functorch/test_logging'}, {'test_file': 'functorch/test_memory_efficient_fusion'}, {'test_file': 'functorch/test_minifier'}, {'test_file': 'functorch/test_parsing'}, {'test_file': 'functorch/test_rearrange'}, {'test_file': 'functorch/test_vmap_registrations'}, {'test_file': 'higher_order_ops/test_invoke_quant'}, {'test_file': 'higher_order_ops/test_local_map'}, {'test_file': 
'higher_order_ops/test_with_effects'}, {'test_file': 'lazy/test_bindings'}, {'test_file': 'lazy/test_debug_util'}, {'test_file': 'lazy/test_functionalization'}, {'test_file': 'lazy/test_generator'}, {'test_file': 'lazy/test_reuse_ir'}, {'test_file': 'lazy/test_step_closures'}, {'test_file': 'lazy/test_ts_opinfo'}, {'test_file': 'nn/test_convolution'}, {'test_file': 'nn/test_dropout'}, {'test_file': 'nn/test_embedding'}, {'test_file': 'nn/test_init'}, {'test_file': 'nn/test_lazy_modules'}, {'test_file': 'nn/test_load_state_dict'}, {'test_file': 'nn/test_module_hooks'}, {'test_file': 'nn/test_packed_sequence'}, {'test_file': 'nn/test_pooling'}, {'test_file': 'nn/test_pruning'}, {'test_file': 'optim/test_lrscheduler'}, {'test_file': 'optim/test_optim'}, {'test_file': 'optim/test_swa_utils'}, {'test_file': 'profiler/test_cpp_thread'}, {'test_file': 'profiler/test_profiler_tree'}, {'test_file': 'profiler/test_python_tracer'}, {'test_file': 'test_accelerator'}, {'test_file': 'test_ao_sparsity'}, {'test_file': 'test_appending_byte_serializer'}, {'test_file': 'test_autocast'}, {'test_file': 'test_autograd_fallback'}, {'test_file': 'test_autoload'}, {'test_file': 'test_autoload_disable'}, {'test_file': 'test_binary_ufuncs'}, {'test_file': 'test_bundled_inputs'}, {'test_file': 'test_comparison_utils'}, {'test_file': 'test_compile_benchmark_util'}, {'test_file': 'test_complex'}, {'test_file': 'test_content_store'}, {'test_file': 'test_cpp_api_parity'}, {'test_file': 'test_cpp_extensions_aot_ninja'}, {'test_file': 'test_cpp_extensions_aot_no_ninja'}, {'test_file': 'test_cpp_extensions_jit'}, {'test_file': 'test_cpp_extensions_mtia_backend'}, {'test_file': 'test_cpp_extensions_stream_and_event'}, {'test_file': 'test_cuda_primary_ctx'}, {'test_file': 'test_cuda_sanitizer'}, {'test_file': 'test_cuda_trace'}, {'test_file': 'test_custom_ops'}, {'test_file': 'test_dataloader'}, {'test_file': 'test_datapipe'}, {'test_file': 'test_dispatch'}, {'test_file': 'test_dlpack'}, 
{'test_file': 'test_dynamic_shapes'}, {'test_file': 'test_expanded_weights'}, {'test_file': 'test_extension_utils'}, {'test_file': 'test_file_check'}, {'test_file': 'test_flop_counter'}, {'test_file': 'test_foreach'}, {'test_file': 'test_function_schema'}, {'test_file': 'test_functional_autograd_benchmark'}, {'test_file': 'test_functional_optim'}, {'test_file': 'test_functionalization'}, {'test_file': 'test_functionalization_of_rng_ops'}, {'test_file': 'test_futures'}, {'test_file': 'test_fx_experimental'}, {'test_file': 'test_fx_passes'}, {'test_file': 'test_fx_reinplace_pass'}, {'test_file': 'test_hop_infra'}, {'test_file': 'test_import_stats'}, {'test_file': 'test_indexing'}, {'test_file': 'test_itt'}, {'test_file': 'test_jit_autocast'}, {'test_file': 'test_jit_disabled'}, {'test_file': 'test_jit_llga_fuser'}, {'test_file': 'test_jiterator'}, {'test_file': 'test_legacy_vmap'}, {'test_file': 'test_license'}, {'test_file': 'test_logging'}, {'test_file': 'test_masked'}, {'test_file': 'test_mkl_verbose'}, {'test_file': 'test_mkldnn'}, {'test_file': 'test_mkldnn_fusion'}, {'test_file': 'test_mkldnn_verbose'}, {'test_file': 'test_module_tracker'}, {'test_file': 'test_monitor'}, {'test_file': 'test_multiprocessing'}, {'test_file': 'test_namedtensor'}, {'test_file': 'test_namedtuple_return_api'}, {'test_file': 'test_native_functions'}, {'test_file': 'test_native_mha'}, {'test_file': 'test_numpy_interop'}, {'test_file': 'test_opaque_obj'}, {'test_file': 'test_openmp'}, {'test_file': 'test_ops_fwd_gradients'}, {'test_file': 'test_ops_gradients'}, {'test_file': 'test_ops_jit'}, {'test_file': 'test_optim'}, {'test_file': 'test_out_dtype_op'}, {'test_file': 'test_package'}, {'test_file': 'test_per_overload_api'}, {'test_file': 'test_prims'}, {'test_file': 'test_proxy_tensor'}, {'test_file': 'test_pruning_op'}, {'test_file': 'test_python_dispatch'}, {'test_file': 'test_pytree'}, {'test_file': 'test_rename_privateuse1_to_existing_device'}, {'test_file': 
'test_scaled_matmul_cuda'}, {'test_file': 'test_scatter_gather_ops'}, {'test_file': 'test_schema_check'}, {'test_file': 'test_segment_reductions'}, {'test_file': 'test_serialization'}, {'test_file': 'test_set_default_mobile_cpu_allocator'}, {'test_file': 'test_shape_ops'}, {'test_file': 'test_show_pickle'}, {'test_file': 'test_sort_and_select'}, {'test_file': 'test_sparse'}, {'test_file': 'test_sparse_csr'}, {'test_file': 'test_stateless'}, {'test_file': 'test_subclass'}, {'test_file': 'test_sympy_utils'}, {'test_file': 'test_tensor_creation_ops'}, {'test_file': 'test_tensorboard'}, {'test_file': 'test_tensorexpr'}, {'test_file': 'test_tensorexpr_pybind'}, {'test_file': 'test_type_info'}, {'test_file': 'test_type_promotion'}, {'test_file': 'test_typing'}, {'test_file': 'test_utils'}, {'test_file': 'test_utils_config_module'}, {'test_file': 'test_utils_filelock'}, {'test_file': 'test_view_ops'}, {'test_file': 'test_vulkan'}, {'test_file': 'test_weak'}, {'test_file': 'test_xnnpack_integration'}, {'test_file': 'torch_np/numpy_tests/core/test_dlpack'}, {'test_file': 'torch_np/numpy_tests/core/test_dtype'}, {'test_file': 'torch_np/numpy_tests/core/test_einsum'}, {'test_file': 'torch_np/numpy_tests/core/test_getlimits'}, {'test_file': 'torch_np/numpy_tests/core/test_indexing'}, {'test_file': 'torch_np/numpy_tests/core/test_numeric'}, {'test_file': 'torch_np/numpy_tests/core/test_numerictypes'}, {'test_file': 'torch_np/numpy_tests/core/test_scalar_ctors'}, {'test_file': 'torch_np/numpy_tests/core/test_scalar_methods'}, {'test_file': 'torch_np/numpy_tests/core/test_scalarinherit'}, {'test_file': 'torch_np/numpy_tests/core/test_scalarmath'}, {'test_file': 'torch_np/numpy_tests/core/test_shape_base'}, {'test_file': 'torch_np/numpy_tests/fft/test_helper'}, {'test_file': 'torch_np/numpy_tests/fft/test_pocketfft'}, {'test_file': 'torch_np/numpy_tests/lib/test_arraypad'}, {'test_file': 'torch_np/numpy_tests/lib/test_arraysetops'}, {'test_file': 
'torch_np/numpy_tests/lib/test_function_base'}, {'test_file': 'torch_np/numpy_tests/lib/test_histograms'}, {'test_file': 'torch_np/numpy_tests/lib/test_index_tricks'}, {'test_file': 'torch_np/numpy_tests/lib/test_shape_base_'}, {'test_file': 'torch_np/numpy_tests/lib/test_twodim_base'}, {'test_file': 'torch_np/numpy_tests/lib/test_type_check'}, {'test_file': 'torch_np/numpy_tests/linalg/test_linalg'}, {'test_file': 'torch_np/test_basic'}, {'test_file': 'torch_np/test_binary_ufuncs'}, {'test_file': 'torch_np/test_dtype'}, {'test_file': 'torch_np/test_function_base'}, {'test_file': 'torch_np/test_indexing'}, {'test_file': 'torch_np/test_nep50_examples'}, {'test_file': 'torch_np/test_random'}, {'test_file': 'torch_np/test_reductions'}, {'test_file': 'torch_np/test_scalars_0D_arrays'}, {'test_file': 'torch_np/test_ufuncs_basic'}, {'test_file': 'torch_np/test_unary_ufuncs'}, {'test_file': 'typing/test_python_operators'}, {'test_file': 'xpu/test_conv'}, {'test_file': 'xpu/test_fusion'}, {'test_file': 'xpu/test_gemm'}], 'excluded': []} from test/test-reports/td_exclusions-e9948be9fa8ef15fc682.json is not a benchmark record, skipping 2025-10-10T02:45:17.5088514Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-10-10T02:45:17.5091591Z /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'lazy/test_ts_opinfo'}], 'excluded': []} from test/test-reports/td_exclusions-7012a63cf4d0be7e2b9b.json is not a benchmark record, skipping 2025-10-10T02:45:17.5094200Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-10-10T02:45:17.5100255Z /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'cpp/Dict_test'}, {'test_file': 'cpp/Dimname_test'}, 
{'test_file': 'cpp/NamedTensor_test'}, {'test_file': 'cpp/apply_utils_test'}, {'test_file': 'cpp/atest'}, {'test_file': 'cpp/basic'}, {'test_file': 'cpp/broadcast_test'}, {'test_file': 'cpp/cpu_generator_test'}, {'test_file': 'cpp/dlconvertor_test'}, {'test_file': 'cpp/extension_backend_test'}, {'test_file': 'cpp/lazy_tensor_test'}, {'test_file': 'cpp/legacy_vmap_test'}, {'test_file': 'cpp/native_test'}, {'test_file': 'cpp/operators_test'}, {'test_file': 'cpp/scalar_tensor_test'}, {'test_file': 'cpp/scalar_test'}, {'test_file': 'cpp/tensor_iterator_test'}, {'test_file': 'cpp/undefined_tensor_test'}, {'test_file': 'cpp/wrapdim_test'}], 'excluded': []} from test/test-reports/td_exclusions-408b5a05abf2a9ed875c.json is not a benchmark record, skipping 2025-10-10T02:45:17.5106001Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-10-10T02:45:17.5108738Z /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'cpp/test_api'}], 'excluded': []} from test/test-reports/td_exclusions-a909b1c63f13e2e38084.json is not a benchmark record, skipping 2025-10-10T02:45:17.5111257Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-10-10T02:45:17.5113812Z /var/home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'cpp/static_runtime_test'}], 'excluded': []} from test/test-reports/td_exclusions-a32c4dc419ef2092a299.json is not a benchmark record, skipping 2025-10-10T02:45:17.5116676Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-10-10T02:45:17.5173529Z Prepare all required actions 2025-10-10T02:45:17.5173928Z Getting action download info 2025-10-10T02:45:17.5197187Z ##[group]Run ./.github/actions/teardown-rocm 
2025-10-10T02:45:17.5197475Z env: 2025-10-10T02:45:17.5197677Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:17.5198043Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:17.5198547Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:17.5199042Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:17.5199792Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:17.5200480Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:17.5200729Z AWS_REGION: us-east-1 2025-10-10T02:45:17.5201003Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:17.5201355Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:17.5205327Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:17.5205708Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:17.5206104Z DEVICE_NAME: rocm 2025-10-10T02:45:17.5206361Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:17.5206639Z ##[endgroup] 2025-10-10T02:45:17.5220158Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-10-10T02:45:17.5220598Z # ignore expansion of "docker ps -q" since it could be empty 2025-10-10T02:45:17.5220953Z # shellcheck disable=SC2046 2025-10-10T02:45:17.5221256Z docker stop $(docker ps -q) || true 2025-10-10T02:45:17.5221550Z # Prune all stopped containers. 
2025-10-10T02:45:17.5221837Z docker container prune -f 2025-10-10T02:45:17.5242421Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:17.5242750Z env: 2025-10-10T02:45:17.5242961Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:17.5243340Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:17.5243853Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:17.5244325Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:17.5245058Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:17.5245745Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:17.5246001Z AWS_REGION: us-east-1 2025-10-10T02:45:17.5246275Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:17.5246594Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:17.5250576Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:17.5250941Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:17.5251327Z DEVICE_NAME: rocm 2025-10-10T02:45:17.5251564Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:17.5251942Z ##[endgroup] 2025-10-10T02:45:28.2591214Z 496f06a5d8bf 2025-10-10T02:45:36.3038045Z Deleted Containers: 2025-10-10T02:45:36.3038907Z 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:36.3039538Z 2025-10-10T02:45:36.3039784Z Total reclaimed space: 10.59GB 2025-10-10T02:45:36.3145838Z Prepare all required actions 2025-10-10T02:45:36.3210220Z ##[group]Run ./.github/actions/diskspace-cleanup 2025-10-10T02:45:36.3210911Z with: 2025-10-10T02:45:36.3211387Z diskspace-cutoff: 70 2025-10-10T02:45:36.3211890Z env: 2025-10-10T02:45:36.3212374Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:36.3213227Z RUNNER_ARTIFACT_DIR: 
/var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:36.3214542Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:36.3216004Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:36.3218350Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:36.3220077Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:36.3220686Z AWS_REGION: us-east-1 2025-10-10T02:45:36.3221355Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:36.3222289Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:36.3232938Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:36.3233826Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 2025-10-10T02:45:36.3234934Z DEVICE_NAME: rocm 2025-10-10T02:45:36.3235493Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:36.3236133Z ##[endgroup] 2025-10-10T02:45:36.3270435Z ##[group]Run set -ex 2025-10-10T02:45:36.3271028Z set -ex 2025-10-10T02:45:36.3271567Z diskspace_cutoff=70 2025-10-10T02:45:36.3272382Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-10-10T02:45:36.3273235Z if [ ! -d "$docker_root_dir" ]; then 2025-10-10T02:45:36.3274688Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-10-10T02:45:36.3275709Z  exit 0 2025-10-10T02:45:36.3276212Z fi 2025-10-10T02:45:36.3277096Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-10-10T02:45:36.3278861Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified" 2025-10-10T02:45:36.3280381Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-10-10T02:45:36.3281174Z  docker system prune -af 2025-10-10T02:45:36.3282382Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-10-10T02:45:36.3283783Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-10-10T02:45:36.3285214Z  echo "Error: Available diskspace is less than $diskspace_cutoff percent. Not enough diskspace." 2025-10-10T02:45:36.3286493Z  echo "$msg" 2025-10-10T02:45:36.3287175Z  exit 1 2025-10-10T02:45:36.3287799Z  else 2025-10-10T02:45:36.3288430Z  difference=$((diskspace - diskspace_new)) 2025-10-10T02:45:36.3289272Z  echo "Diskspace saved: $difference percent" 2025-10-10T02:45:36.3289979Z  fi 2025-10-10T02:45:36.3290449Z fi 2025-10-10T02:45:36.3348795Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:45:36.3349580Z env: 2025-10-10T02:45:36.3350063Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:45:36.3350926Z RUNNER_ARTIFACT_DIR: /var/home/pytorchci/actions-runner/_work/_temp/artifacts 2025-10-10T02:45:36.3352184Z RUNNER_TEST_RESULTS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/test-results 2025-10-10T02:45:36.3353380Z RUNNER_DOCS_DIR: /var/home/pytorchci/actions-runner/_work/_temp/docs 2025-10-10T02:45:36.3355993Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 110 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-10-10T02:45:36.3357685Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T02:45:36.3358291Z AWS_REGION: us-east-1 2025-10-10T02:45:36.3358940Z AWS_ACCESS_KEY_ID: *** 2025-10-10T02:45:36.3359719Z AWS_SECRET_ACCESS_KEY: *** 2025-10-10T02:45:36.3369177Z AWS_SESSION_TOKEN: *** 2025-10-10T02:45:36.3369937Z CONTAINER_NAME: 496f06a5d8bfd5cdf0e002901447da33847120f97fdb68bcc2f188211daa0192 
2025-10-10T02:45:36.3370725Z DEVICE_NAME: rocm 2025-10-10T02:45:36.3371191Z DEVICE_TYPE: AMD Instinct MI250X/MI250 2025-10-10T02:45:36.3371719Z ##[endgroup] 2025-10-10T02:45:36.3460199Z + diskspace_cutoff=70 2025-10-10T02:45:36.3469112Z ++ docker info -f '{{.DockerRootDir}}' 2025-10-10T02:45:36.4426814Z + docker_root_dir=/media/4TB/docker-rootless 2025-10-10T02:45:36.4427698Z + '[' '!' -d /media/4TB/docker-rootless ']' 2025-10-10T02:45:36.4443790Z ++ df -H --output=pcent /media/4TB/docker-rootless 2025-10-10T02:45:36.4445118Z ++ sed -n 2p 2025-10-10T02:45:36.4450852Z ++ sed s/%// 2025-10-10T02:45:36.4452290Z ++ sed 's/ //' 2025-10-10T02:45:36.4498856Z + diskspace=51 2025-10-10T02:45:36.4500216Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified' 2025-10-10T02:45:36.4501588Z + [[ 51 -ge 70 ]] 2025-10-10T02:45:36.4565368Z Post job cleanup. 2025-10-10T02:45:36.4608511Z Post job cleanup. 2025-10-10T02:45:36.6019780Z Post job cleanup. 2025-10-10T02:45:36.6650730Z Logging out of registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T02:45:36.7179938Z Post job cleanup. 2025-10-10T02:45:36.8739426Z Post job cleanup. 2025-10-10T02:45:36.8825683Z Post job cleanup. 
2025-10-10T02:45:36.9764174Z [command]/usr/bin/git version 2025-10-10T02:45:36.9822019Z git version 2.34.1 2025-10-10T02:45:36.9856166Z Copying '/var/home/pytorchci/.gitconfig' to '/var/home/pytorchci/actions-runner/_work/_temp/d635d01a-6985-4159-b25f-cc22b42664cc/.gitconfig' 2025-10-10T02:45:36.9868917Z Temporarily overriding HOME='/var/home/pytorchci/actions-runner/_work/_temp/d635d01a-6985-4159-b25f-cc22b42664cc' before making global git config changes 2025-10-10T02:45:36.9870734Z Adding repository directory to the temporary git global config as a safe directory 2025-10-10T02:45:36.9872443Z [command]/usr/bin/git config --global --add safe.directory /var/home/pytorchci/actions-runner/_work/pytorch/pytorch 2025-10-10T02:45:36.9953576Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-10-10T02:45:37.0015416Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-10-10T02:45:37.0723394Z Entering 'android/libs/fbjni' 2025-10-10T02:45:37.0843333Z Entering 'third_party/FP16' 2025-10-10T02:45:37.0978323Z Entering 'third_party/FXdiv' 2025-10-10T02:45:37.1100003Z Entering 'third_party/NNPACK' 2025-10-10T02:45:37.1224598Z Entering 'third_party/NVTX' 2025-10-10T02:45:37.1346099Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T02:45:37.1468764Z Entering 'third_party/XNNPACK' 2025-10-10T02:45:37.1626366Z Entering 'third_party/aiter' 2025-10-10T02:45:37.1747125Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T02:45:37.1895187Z Entering 'third_party/benchmark' 2025-10-10T02:45:37.2022667Z Entering 'third_party/composable_kernel' 2025-10-10T02:45:37.2164280Z Entering 'third_party/cpp-httplib' 2025-10-10T02:45:37.2285167Z Entering 'third_party/cpuinfo' 2025-10-10T02:45:37.2406883Z Entering 'third_party/cudnn_frontend' 2025-10-10T02:45:37.2527677Z Entering 'third_party/cutlass' 
2025-10-10T02:45:37.2670335Z Entering 'third_party/fbgemm' 2025-10-10T02:45:37.2801717Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T02:45:37.2919995Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T02:45:37.3064395Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T02:45:37.3186689Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T02:45:37.3334998Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T02:45:37.3461805Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T02:45:37.3580254Z Entering 'third_party/fbgemm/external/json' 2025-10-10T02:45:37.3714759Z Entering 'third_party/flash-attention' 2025-10-10T02:45:37.3838498Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T02:45:37.3968562Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T02:45:37.4115149Z Entering 'third_party/flatbuffers' 2025-10-10T02:45:37.4254470Z Entering 'third_party/fmt' 2025-10-10T02:45:37.4380646Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T02:45:37.4513189Z Entering 'third_party/gloo' 2025-10-10T02:45:37.4641162Z Entering 'third_party/googletest' 2025-10-10T02:45:37.4764379Z Entering 'third_party/ideep' 2025-10-10T02:45:37.4881762Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T02:45:37.5021203Z Entering 'third_party/ittapi' 2025-10-10T02:45:37.5144513Z Entering 'third_party/kineto' 2025-10-10T02:45:37.5264155Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T02:45:37.5383744Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T02:45:37.5504462Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T02:45:37.5622503Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T02:45:37.5744632Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T02:45:37.5857381Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T02:45:37.5982404Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T02:45:37.6103082Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T02:45:37.6225456Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T02:45:37.6351396Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T02:45:37.6475399Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T02:45:37.6592213Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T02:45:37.6715428Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T02:45:37.6854100Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T02:45:37.6974229Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T02:45:37.7102457Z Entering 'third_party/kleidiai' 2025-10-10T02:45:37.7230879Z Entering 'third_party/mimalloc' 2025-10-10T02:45:37.7357824Z Entering 'third_party/nlohmann' 2025-10-10T02:45:37.7481258Z Entering 'third_party/onnx' 2025-10-10T02:45:37.7638163Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T02:45:37.7771790Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T02:45:37.7899885Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T02:45:37.8017650Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T02:45:37.8137308Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T02:45:37.8253719Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T02:45:37.8375243Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T02:45:37.8493163Z Entering 
'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T02:45:37.8609972Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T02:45:37.8721711Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T02:45:37.8843790Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T02:45:37.8968938Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T02:45:37.9137629Z Entering 'third_party/pocketfft' 2025-10-10T02:45:37.9261238Z Entering 'third_party/protobuf' 2025-10-10T02:45:37.9392362Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T02:45:37.9512569Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T02:45:37.9641010Z Entering 'third_party/psimd' 2025-10-10T02:45:37.9764488Z Entering 'third_party/pthreadpool' 2025-10-10T02:45:37.9884020Z Entering 'third_party/pybind11' 2025-10-10T02:45:38.0004207Z Entering 'third_party/python-peachpy' 2025-10-10T02:45:38.0125819Z Entering 'third_party/sleef' 2025-10-10T02:45:38.0245297Z Entering 'third_party/tensorpipe' 2025-10-10T02:45:38.0363717Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T02:45:38.0482249Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T02:45:38.0599540Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T02:45:38.0717888Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T02:45:38.0824226Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T02:45:38.0998317Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-10-10T02:45:38.1057245Z http.https://github.com/.extraheader 2025-10-10T02:45:38.1081219Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-10-10T02:45:38.1162176Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only 
--get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-10-10T02:45:38.1834359Z Entering 'android/libs/fbjni' 2025-10-10T02:45:38.1906813Z http.https://github.com/.extraheader 2025-10-10T02:45:38.2004424Z Entering 'third_party/FP16' 2025-10-10T02:45:38.2077097Z http.https://github.com/.extraheader 2025-10-10T02:45:38.2174391Z Entering 'third_party/FXdiv' 2025-10-10T02:45:38.2242977Z http.https://github.com/.extraheader 2025-10-10T02:45:38.2341200Z Entering 'third_party/NNPACK' 2025-10-10T02:45:38.2408017Z http.https://github.com/.extraheader 2025-10-10T02:45:38.2507851Z Entering 'third_party/NVTX' 2025-10-10T02:45:38.2574930Z http.https://github.com/.extraheader 2025-10-10T02:45:38.2675794Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T02:45:38.2740988Z http.https://github.com/.extraheader 2025-10-10T02:45:38.2838025Z Entering 'third_party/XNNPACK' 2025-10-10T02:45:38.2905711Z http.https://github.com/.extraheader 2025-10-10T02:45:38.3034226Z Entering 'third_party/aiter' 2025-10-10T02:45:38.3104561Z http.https://github.com/.extraheader 2025-10-10T02:45:38.3201249Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T02:45:38.3265781Z http.https://github.com/.extraheader 2025-10-10T02:45:38.3381530Z Entering 'third_party/benchmark' 2025-10-10T02:45:38.3448607Z http.https://github.com/.extraheader 2025-10-10T02:45:38.3544786Z Entering 'third_party/composable_kernel' 2025-10-10T02:45:38.3611289Z http.https://github.com/.extraheader 2025-10-10T02:45:38.3724968Z Entering 'third_party/cpp-httplib' 2025-10-10T02:45:38.3790562Z http.https://github.com/.extraheader 2025-10-10T02:45:38.3886377Z Entering 'third_party/cpuinfo' 2025-10-10T02:45:38.3951081Z http.https://github.com/.extraheader 2025-10-10T02:45:38.4047059Z Entering 'third_party/cudnn_frontend' 2025-10-10T02:45:38.4109919Z http.https://github.com/.extraheader 2025-10-10T02:45:38.4205358Z Entering 
'third_party/cutlass' 2025-10-10T02:45:38.4272987Z http.https://github.com/.extraheader 2025-10-10T02:45:38.4387545Z Entering 'third_party/fbgemm' 2025-10-10T02:45:38.4453700Z http.https://github.com/.extraheader 2025-10-10T02:45:38.4551083Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T02:45:38.4612594Z http.https://github.com/.extraheader 2025-10-10T02:45:38.4705299Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T02:45:38.4765615Z http.https://github.com/.extraheader 2025-10-10T02:45:38.4876057Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T02:45:38.4938634Z http.https://github.com/.extraheader 2025-10-10T02:45:38.5030016Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T02:45:38.5091349Z http.https://github.com/.extraheader 2025-10-10T02:45:38.5205283Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T02:45:38.5266782Z http.https://github.com/.extraheader 2025-10-10T02:45:38.5358481Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T02:45:38.5419366Z http.https://github.com/.extraheader 2025-10-10T02:45:38.5511496Z Entering 'third_party/fbgemm/external/json' 2025-10-10T02:45:38.5575868Z http.https://github.com/.extraheader 2025-10-10T02:45:38.5678814Z Entering 'third_party/flash-attention' 2025-10-10T02:45:38.5746359Z http.https://github.com/.extraheader 2025-10-10T02:45:38.5838021Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T02:45:38.5900628Z http.https://github.com/.extraheader 2025-10-10T02:45:38.6010323Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T02:45:38.6072128Z http.https://github.com/.extraheader 2025-10-10T02:45:38.6190606Z Entering 'third_party/flatbuffers' 2025-10-10T02:45:38.6256058Z http.https://github.com/.extraheader 2025-10-10T02:45:38.6357676Z Entering 'third_party/fmt' 2025-10-10T02:45:38.6422979Z http.https://github.com/.extraheader 2025-10-10T02:45:38.6516798Z Entering 'third_party/gemmlowp/gemmlowp' 
2025-10-10T02:45:38.6583670Z http.https://github.com/.extraheader 2025-10-10T02:45:38.6678399Z Entering 'third_party/gloo' 2025-10-10T02:45:38.6745532Z http.https://github.com/.extraheader 2025-10-10T02:45:38.6839852Z Entering 'third_party/googletest' 2025-10-10T02:45:38.6903474Z http.https://github.com/.extraheader 2025-10-10T02:45:38.6999698Z Entering 'third_party/ideep' 2025-10-10T02:45:38.7062149Z http.https://github.com/.extraheader 2025-10-10T02:45:38.7151596Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T02:45:38.7212738Z http.https://github.com/.extraheader 2025-10-10T02:45:38.7327577Z Entering 'third_party/ittapi' 2025-10-10T02:45:38.7393289Z http.https://github.com/.extraheader 2025-10-10T02:45:38.7488224Z Entering 'third_party/kineto' 2025-10-10T02:45:38.7552553Z http.https://github.com/.extraheader 2025-10-10T02:45:38.7642382Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T02:45:38.7704353Z http.https://github.com/.extraheader 2025-10-10T02:45:38.7790577Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T02:45:38.7850302Z http.https://github.com/.extraheader 2025-10-10T02:45:38.7942236Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T02:45:38.8005087Z http.https://github.com/.extraheader 2025-10-10T02:45:38.8096178Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T02:45:38.8157758Z http.https://github.com/.extraheader 2025-10-10T02:45:38.8252429Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T02:45:38.8314559Z http.https://github.com/.extraheader 2025-10-10T02:45:38.8401985Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T02:45:38.8463154Z http.https://github.com/.extraheader 2025-10-10T02:45:38.8561122Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T02:45:38.8620690Z 
http.https://github.com/.extraheader 2025-10-10T02:45:38.8712300Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T02:45:38.8772435Z http.https://github.com/.extraheader 2025-10-10T02:45:38.8862495Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T02:45:38.8922399Z http.https://github.com/.extraheader 2025-10-10T02:45:38.9015767Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T02:45:38.9079875Z http.https://github.com/.extraheader 2025-10-10T02:45:38.9172082Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T02:45:38.9234758Z http.https://github.com/.extraheader 2025-10-10T02:45:38.9321976Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T02:45:38.9385720Z http.https://github.com/.extraheader 2025-10-10T02:45:38.9485222Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T02:45:38.9544523Z http.https://github.com/.extraheader 2025-10-10T02:45:38.9651918Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T02:45:38.9712091Z http.https://github.com/.extraheader 2025-10-10T02:45:38.9801315Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T02:45:38.9861245Z http.https://github.com/.extraheader 2025-10-10T02:45:38.9958640Z Entering 'third_party/kleidiai' 2025-10-10T02:45:39.0022475Z http.https://github.com/.extraheader 2025-10-10T02:45:39.0117576Z Entering 'third_party/mimalloc' 2025-10-10T02:45:39.0182334Z http.https://github.com/.extraheader 2025-10-10T02:45:39.0275264Z Entering 'third_party/nlohmann' 2025-10-10T02:45:39.0341513Z http.https://github.com/.extraheader 2025-10-10T02:45:39.0440908Z Entering 'third_party/onnx' 2025-10-10T02:45:39.0505453Z http.https://github.com/.extraheader 2025-10-10T02:45:39.0636444Z Entering 
'third_party/onnx/third_party/pybind11' 2025-10-10T02:45:39.0698475Z http.https://github.com/.extraheader 2025-10-10T02:45:39.0799491Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T02:45:39.0864082Z http.https://github.com/.extraheader 2025-10-10T02:45:39.0956035Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T02:45:39.1016333Z http.https://github.com/.extraheader 2025-10-10T02:45:39.1106513Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T02:45:39.1166805Z http.https://github.com/.extraheader 2025-10-10T02:45:39.1257437Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T02:45:39.1317993Z http.https://github.com/.extraheader 2025-10-10T02:45:39.1407524Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T02:45:39.1467078Z http.https://github.com/.extraheader 2025-10-10T02:45:39.1563324Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T02:45:39.1626730Z http.https://github.com/.extraheader 2025-10-10T02:45:39.1714415Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T02:45:39.1772797Z http.https://github.com/.extraheader 2025-10-10T02:45:39.1864104Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T02:45:39.1925973Z http.https://github.com/.extraheader 2025-10-10T02:45:39.2011219Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T02:45:39.2071756Z http.https://github.com/.extraheader 2025-10-10T02:45:39.2168045Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T02:45:39.2228207Z http.https://github.com/.extraheader 2025-10-10T02:45:39.2325828Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T02:45:39.2385797Z http.https://github.com/.extraheader 2025-10-10T02:45:39.2520231Z Entering 'third_party/pocketfft' 2025-10-10T02:45:39.2587851Z 
http.https://github.com/.extraheader 2025-10-10T02:45:39.2683099Z Entering 'third_party/protobuf' 2025-10-10T02:45:39.2753054Z http.https://github.com/.extraheader 2025-10-10T02:45:39.2845878Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T02:45:39.2907486Z http.https://github.com/.extraheader 2025-10-10T02:45:39.3000639Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T02:45:39.3062088Z http.https://github.com/.extraheader 2025-10-10T02:45:39.3161400Z Entering 'third_party/psimd' 2025-10-10T02:45:39.3226410Z http.https://github.com/.extraheader 2025-10-10T02:45:39.3320060Z Entering 'third_party/pthreadpool' 2025-10-10T02:45:39.3385360Z http.https://github.com/.extraheader 2025-10-10T02:45:39.3478633Z Entering 'third_party/pybind11' 2025-10-10T02:45:39.3542854Z http.https://github.com/.extraheader 2025-10-10T02:45:39.3639088Z Entering 'third_party/python-peachpy' 2025-10-10T02:45:39.3704688Z http.https://github.com/.extraheader 2025-10-10T02:45:39.3798336Z Entering 'third_party/sleef' 2025-10-10T02:45:39.3864090Z http.https://github.com/.extraheader 2025-10-10T02:45:39.3959504Z Entering 'third_party/tensorpipe' 2025-10-10T02:45:39.4026420Z http.https://github.com/.extraheader 2025-10-10T02:45:39.4119288Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T02:45:39.4179051Z http.https://github.com/.extraheader 2025-10-10T02:45:39.4270595Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T02:45:39.4330083Z http.https://github.com/.extraheader 2025-10-10T02:45:39.4421750Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T02:45:39.4484223Z http.https://github.com/.extraheader 2025-10-10T02:45:39.4573543Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T02:45:39.4634438Z http.https://github.com/.extraheader 2025-10-10T02:45:39.4721723Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T02:45:39.4782692Z http.https://github.com/.extraheader 
2025-10-10T02:45:39.5201977Z Cleaning up orphan processes